EXPRESS MAIL NO: EL485954545US
METHODS OF IDENTIFYING MODULATORS OF BROMODOMAINS 



FIELD OF THE INVENTION 

The present invention provides the three-dimensional structure of a histone
acetyltransferase bromodomain. The three-dimensional structural information is
included in the invention. The present invention also identifies, for the first time, that
bromodomains can bind to acetylated binding partners. The interaction between
bromodomains and their binding partners plays a crucial role in various cellular
functions, including the regulation/modulation of DNA transcription. Therefore,
the present invention provides procedures for identifying agents that can modulate the
interaction of bromodomains and their binding partners by high throughput drug
screening and/or through the use of rational drug design based on the
three-dimensional data provided herein.

BACKGROUND OF THE INVENTION 

In recent years great strides have been made in the elucidation of the steps involved in
intercellular and intracellular signaling. Indeed, the individual steps of the cascade of
events involved in a number of cellular signal transduction processes have been
determined. For example, intercellular signal transduction generally begins with an
intercellular ligand binding the extracellular portion of a receptor of the plasma
membrane. The bound receptor then either directly or indirectly initiates the
activation of one or more cellular factors. An activated cellular factor may act as a
transcription factor by entering the nucleus to interact with its corresponding genomic
response element, or alternatively, it may interact with other cellular factors
depending on the complexity of the process. In either case, one or more transcription
factors ultimately bind to one or more specific genomic response elements. This
binding plays a crucial role in the up and/or down regulation of the transcription of
the specific genes that are under the control of these genomic response elements.
However, the process of re-organizing the chromatin of eukaryotic cells, which is a
prerequisite for the binding of the transcription factor to the genomic response
elements, has remained a mystery.

Chromatin contains several highly conserved histone proteins including: H3, H4,
H2A, H2B, and H1. These histone proteins package eukaryotic DNA into repeating
nucleosomal units that are folded into higher-order chromatin fibers [Luger and
Richmond, Curr. Opin. Genet. Dev. 8:140-146 (1998)]. A portion of the histone that
comprises roughly a quarter of the protein protrudes from the chromatin surface, and
is thereby sensitive to proteolytic enzymes [van Holde, in Chromatin (Rich, A., ed.,
Springer, New York) pages 111-148 (1988); Hecht et al., Cell 80:583-592 (1995)].
This portion of the histone is known as the "histone tail". Histone tails tend to be free
for protein-protein interaction, and are also the portion of the histone most prone to
post-translational modification. Such post-translational modification includes
acetylation, phosphorylation, methylation, ubiquitination, and ADP-ribosylation [van
Holde, in Chromatin (Rich, A., ed., Springer, New York) pages 111-148 (1988)].

Of all classes of proteins, histones are amongst the most susceptible to
post-translational modification. Perhaps the best studied post-translational modification of
histones is the acetylation of specific lysine residues [Grunstein, M., Nature, 389:349-352
(1997)]. Indeed, acetylation of histone lysine residues has been suggested to
play a pivotal role in chromatin remodeling and gene activation. Consistently,
distinct classes of enzymes, namely histone acetyltransferases (HATs) and histone
deacetylases (HDACs), acetylate or de-acetylate specific histone lysine residues
[Struhl, Genes Dev. 12:599-606 (1998)].

Nearly all known nuclear HATs contain an approximately 110 amino acid sequence
known as the bromodomain [Jeanmougin et al., Trends in Biochemical Sciences,
22:151-153 (1997)], a protein motif that was initially discovered in the Drosophila
brahma protein. Bromodomains are found in a large number of chromatin-associated
proteins and have now been identified in approximately 40 proteins, often adjacent to
other protein motifs [Jeanmougin et al., Trends in Biochemical Sciences, 22:151-153
(1997); Tamkun et al., Cell, 68:561-572 (1992); Haynes et al., Nucleic Acids Research,
20:2603 (1992)]. Proteins that contain a bromodomain often contain a second
bromodomain. However, despite the wide occurrence of bromodomains and their
likely role in chromatin regulation, their three-dimensional structure and binding
partners heretofore have remained unknown.



Therefore, there is a need to identify a binding partner for a bromodomain. In
addition, there is a need to identify agonists or antagonists to the
bromodomain-binding partner complex. Since a preferred method of drug-screening relies on
structure-based drug design, there is also a need to determine the three-dimensional
structure of a bromodomain. In this case, once the three-dimensional structure of the
bromodomain is determined, potential agonists and/or potential antagonists can be
designed with the aid of computer modeling [Bugg et al., Scientific American,
Dec.:92-98 (1993); West et al., TIPS, 16:67-74 (1995); Dunbrack et al., Folding &
Design, 2:27-42 (1997)]. However, heretofore the three-dimensional structure of the
bromodomain has remained unknown. Therefore, there is a need for obtaining a form
of the bromodomain that is amenable to NMR analysis and/or X-ray crystallographic
analysis. Furthermore, there is a need for the determination of the three-dimensional
structure of the bromodomain. Finally, there is a need for procedures for related
structure-based drug design predicated on such structural data.

The citation of any reference herein should not be construed as an admission that such
reference is available as "Prior Art" to the instant application.

SUMMARY OF THE INVENTION 

The present invention discloses, for the first time, that bromodomains bind to
acetyl-lysine residues of proteins. The present invention also provides the three-dimensional
structure of a bromodomain as well as the three-dimensional structure of a
bromodomain-acetyl-histamine complex. The structural information provided can be
employed in methods of identifying drugs that can modulate the cellular processes
that involve bromodomain-acetyl-lysine interactions. These interactions include
chromatin remodeling, which is a required step in eukaryotic transcription. In a
particular embodiment, the three-dimensional structural information is used in the
design of an inhibitor of leukemia.




The present invention provides an isolated nucleic acid that encodes a peptide 
consisting of about 21 to 40 amino acids that comprises a ZA loop of a bromodomain. 
In a preferred embodiment the peptide comprises about 23 to 34 amino acids. The 
isolated nucleic acid can further comprise a heterologous nucleotide sequence. 


In a preferred embodiment the peptide comprises the amino acid sequence of SEQ ID
NO:3. In another embodiment the peptide comprises the amino acid sequence of SEQ
ID NO:43. In particular embodiments the ZA loop is obtained from the bromodomain
having the amino acid sequence of SEQ ID NO:7, or SEQ ID NO:8, or SEQ ID NO:9,
or SEQ ID NO:10, or SEQ ID NO:11, or SEQ ID NO:12, or SEQ ID NO:13, or SEQ
ID NO:14, or SEQ ID NO:15, or SEQ ID NO:16, or SEQ ID NO:17, or SEQ ID
NO:18, or SEQ ID NO:19, or SEQ ID NO:20, or SEQ ID NO:21, or SEQ ID NO:22,
or SEQ ID NO:23, or SEQ ID NO:24, or SEQ ID NO:25, or SEQ ID NO:26, or SEQ
ID NO:27, or SEQ ID NO:28, or SEQ ID NO:29, or SEQ ID NO:30, or SEQ ID
NO:31, or SEQ ID NO:32, or SEQ ID NO:33, or SEQ ID NO:34, or SEQ ID NO:35,
or SEQ ID NO:36, or SEQ ID NO:37, or SEQ ID NO:38, or SEQ ID NO:39, or SEQ
ID NO:40, or SEQ ID NO:41, or SEQ ID NO:42.

The present invention further provides a recombinant DNA molecule that comprises
an isolated nucleic acid of the present invention, as described above, with or without a
heterologous nucleotide sequence. Such a recombinant DNA molecule can be
operatively linked to an expression control sequence and can be part of an expression
vector. The present invention further provides a cell that comprises such an
expression vector. The cell can be either a eukaryotic or a prokaryotic cell. The
present invention further provides a method of expressing the peptides of the present
invention or fragments thereof in this cell. One such method comprises culturing the
cell in an appropriate cell culture medium under conditions that provide for
expression of the peptide by the cell.

The present invention further provides a peptide consisting of about 21 to 40 amino
acids that comprises a ZA loop of a bromodomain. In a preferred embodiment the
peptide comprises about 23 to 34 amino acids. The present invention also provides
fusion proteins or peptides comprising these peptides.



In a preferred embodiment the peptide comprises the amino acid sequence of SEQ ID
NO:3. In another embodiment the peptide comprises the amino acid sequence of SEQ
ID NO:43. In particular embodiments the ZA loop is obtained from the bromodomain
having the amino acid sequence of SEQ ID NO:7, or SEQ ID NO:8, or SEQ ID NO:9,
or SEQ ID NO:10, or SEQ ID NO:11, or SEQ ID NO:12, or SEQ ID NO:13, or SEQ
ID NO:14, or SEQ ID NO:15, or SEQ ID NO:16, or SEQ ID NO:17, or SEQ ID
NO:18, or SEQ ID NO:19, or SEQ ID NO:20, or SEQ ID NO:21, or SEQ ID NO:22,
or SEQ ID NO:23, or SEQ ID NO:24, or SEQ ID NO:25, or SEQ ID NO:26, or SEQ
ID NO:27, or SEQ ID NO:28, or SEQ ID NO:29, or SEQ ID NO:30, or SEQ ID
NO:31, or SEQ ID NO:32, or SEQ ID NO:33, or SEQ ID NO:34, or SEQ ID NO:35,
or SEQ ID NO:36, or SEQ ID NO:37, or SEQ ID NO:38, or SEQ ID NO:39, or SEQ
ID NO:40, or SEQ ID NO:41, or SEQ ID NO:42.

The present invention also provides antibodies raised against the peptides/proteins of
the present invention, or raised against an antigenic fragment of these
proteins/fragments. In a particular embodiment an antibody is raised against a
fragment of the ZA loop of a bromodomain. In another embodiment an antibody is
raised against a fragment of a protein or peptide that comprises an acetyl-lysine,
wherein the protein or peptide can bind to a bromodomain. Such fragments can be
conjugated to a carrier protein or be part of a fusion protein. In one embodiment the
antibody is a polyclonal antibody. In another embodiment, the antibody is a
monoclonal antibody. A hybridoma that makes the monoclonal antibody is also part
of the present invention. In a particular embodiment the antibody is a chimeric
antibody. Antibodies that can specifically recognize acetyl-lysine residues involved
in bromodomain binding are also part of the present invention.

In another aspect of the present invention a method is provided for identifying a
compound that modulates the affinity of a bromodomain for a ligand (and/or protein)
that comprises an acetylated lysine. One such embodiment comprises contacting the
bromodomain and the ligand in the presence of a compound under conditions in which
the bromodomain and the ligand bind in the absence of the compound. The affinity of
the bromodomain for the ligand is then determined (e.g., measured). A compound is
identified as a compound that modulates the affinity of the bromodomain for the
ligand when there is a change in the affinity of the bromodomain for the ligand in the
presence of the compound. When the affinity of the bromodomain for the ligand
increases in the presence of the compound, the compound is identified as a promoting
agent for the bromodomain-ligand complex. When the affinity of the bromodomain
for the ligand decreases in the presence of the compound, the compound is identified
as an inhibitor of the bromodomain-ligand complex. In a preferred embodiment, the
compound to be tested is pre-selected by performing rational drug design with the set
of atomic coordinates obtained from one or more of Tables 1-6. More preferably the
selecting is performed in conjunction with computer modeling. In a particular
embodiment, the compound is selected by performing rational drug design with a
set of atomic coordinates defining the three-dimensional structure of a bromodomain
consisting of the amino acid sequence of SEQ ID NO:7 alone or with acetyl-histamine.
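
The decision rule in the preceding paragraph can be illustrated with a minimal sketch. It assumes, for illustration only, that affinity is reported as a dissociation constant (a lower value meaning tighter binding) and that a fractional `tolerance` separates a real change from measurement noise; the names `AffinityResult`, `classify_compound`, and `tolerance` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class AffinityResult:
    """Hypothetical record of one assay: Kd values in molar units."""
    kd_without: float  # bromodomain-ligand Kd measured without the compound
    kd_with: float     # bromodomain-ligand Kd measured with the compound


def classify_compound(result: AffinityResult, tolerance: float = 0.2) -> str:
    """Label a compound by how it shifts bromodomain-ligand affinity.

    A lower Kd means higher affinity. `tolerance` is an assumed fractional
    change below which the two measurements are treated as equal.
    """
    if result.kd_without <= 0 or result.kd_with <= 0:
        raise ValueError("dissociation constants must be positive")
    ratio = result.kd_with / result.kd_without
    if ratio < 1 - tolerance:
        return "promoting agent"   # affinity increased
    if ratio > 1 + tolerance:
        return "inhibitor"         # affinity decreased
    return "no modulation detected"
```

The threshold is a design choice of this sketch; an actual assay would set it from the reproducibility of the affinity measurement.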

The present invention also provides a method of identifying a compound that
modulates the stability of a bromodomain-acetyl-lysine binding complex. One such
embodiment comprises contacting the bromodomain-acetyl-lysine binding complex
with the compound under conditions in which the bromodomain-acetyl-lysine
binding complex forms in the absence of the compound. The stability of the
bromodomain-acetyl-lysine binding complex is then determined (e.g., measured). A
compound is identified as a compound that modulates the stability of the
bromodomain-acetyl-lysine binding complex when there is a change in the stability
of the bromodomain-acetyl-lysine binding complex in the presence of that compound.
When the stability of the bromodomain-acetyl-lysine binding complex increases in the
presence of the compound, the compound is identified as a stabilizing agent. When
the stability of the bromodomain-acetyl-lysine binding complex decreases in the
presence of the compound, the compound is identified as an inhibitor. In a preferred
embodiment, the compound to be tested is pre-selected by performing rational drug
design with the set of atomic coordinates obtained from one or more of Tables 1-6.
More preferably the selecting is performed in conjunction with computer modeling.
In a particular embodiment, the compound is selected by performing rational drug
design with a set of atomic coordinates defining the three-dimensional structure of a
bromodomain consisting of the amino acid sequence of SEQ ID NO:7 alone or with
acetyl-histamine.

As anyone having skill in the art of drug development would readily understand, the
potential drugs selected by the above methodologies can be refined by re-testing in
appropriate drug assays, including those disclosed herein. Chemical analogs of such
potential drugs can be obtained (either through chemical synthesis or drug libraries)
and be analogously tested. Therefore, methods comprising successive iterations of the
steps of the individual drug assays, as exemplified herein, using either repetitive or
different binding studies, or transcription activation studies or other such studies are
envisioned in the present invention. In addition, potential drugs may be identified
first by rapid throughput drug screening, as described below, prior to performing
computer modeling on a potential drug using the three-dimensional structure of the
bromodomain.

The present invention further comprises all of the potential, selected, and putative
compounds (drugs) identified by the methods of the present invention, as well as the 
final drugs themselves identified with the methods of the present invention. 

The present invention further provides a method for identifying a potential binding
partner for a protein (e.g., a histone) comprising an acetyl-lysine. One such
embodiment comprises contacting the protein with a polypeptide comprising a
bromodomain. In a preferred embodiment the bromodomain comprises the amino
acid sequence of SEQ ID NO:3. In particular embodiments the bromodomain has the
amino acid sequence of SEQ ID NO:7, or SEQ ID NO:8, or SEQ ID NO:9, or SEQ ID
NO:10, or SEQ ID NO:11, or SEQ ID NO:12, or SEQ ID NO:13, or SEQ ID NO:14,
or SEQ ID NO:15, or SEQ ID NO:16, or SEQ ID NO:17, or SEQ ID NO:18, or SEQ
ID NO:19, or SEQ ID NO:20, or SEQ ID NO:21, or SEQ ID NO:22, or SEQ ID
NO:23, or SEQ ID NO:24, or SEQ ID NO:25, or SEQ ID NO:26, or SEQ ID NO:27,
or SEQ ID NO:28, or SEQ ID NO:29, or SEQ ID NO:30, or SEQ ID NO:31, or SEQ
ID NO:32, or SEQ ID NO:33, or SEQ ID NO:34, or SEQ ID NO:35, or SEQ ID
NO:36, or SEQ ID NO:37, or SEQ ID NO:38, or SEQ ID NO:39, or SEQ ID NO:40,
or SEQ ID NO:41, or SEQ ID NO:42.

The present invention further provides a method for identifying a protein having a 
bromodomain. One such embodiment comprises contacting a cellular extract with a 
peptide comprising an acetyl-lysine. 


The present invention further provides agents that can inhibit the binding of a
bromodomain with a protein comprising an acetyl-lysine. In one embodiment the
agent is ISYGR-AcK-KRRQRR (SEQ ID NO:4). In another embodiment the agent is
ARKSTGG-AcK-APRKQL (SEQ ID NO:5). In still another embodiment the agent
is QSTSRHK-AcK-LMFKTE (SEQ ID NO:6). In yet another embodiment the agent
is an analog of acetyl-lysine such as acetyl-histamine. In still another embodiment the
agent is an antibody that recognizes an acetyl-lysine of a protein binding partner of a
bromodomain. In a preferred embodiment the agent is an antibody raised against a
ZA loop of a bromodomain. These agents can be used as pharmaceuticals in
compositions that contain a pharmaceutically acceptable carrier, for example, or in the
various drug assays of the present invention, serving as controls to demonstrate
specificity.

Accordingly, it is a principal object of the present invention to provide the
three-dimensional coordinates of a bromodomain.

It is a further object of the present invention to provide the three-dimensional
coordinates of a bromodomain complexed with acetyl-histamine.

It is a further object of the present invention to provide an assay for identifying
proteins that contain bromodomains that bind proteins that comprise acetyl-lysine.

It is a further object of the present invention to provide methods of identifying drugs
that can modulate the bromodomain-acetyl-lysine binding complex.

It is a further object of the present invention to provide methods of identifying drugs
that can inhibit the binding of a bromodomain to a protein containing acetyl-lysine.

It is a further object of the present invention to provide methods that incorporate the
use of rational design for identifying such drugs.

It is a further object of the present invention to provide a method of identifying drugs
that can treat leukemia.

It is a further object of the present invention to provide a method of identifying drugs
that can treat and/or prevent AIDS.

These and other aspects of the present invention will be better appreciated by
reference to the following drawings and Detailed Description.

BRIEF DESCRIPTION OF THE DRAWINGS 


Figure 1. Structure-based sequence alignment of a selected number of bromodomains.
The sequences were aligned based on the NMR-derived structure of the P/CAF
bromodomain, and the predicted four α-helices are shown in green boxes.
Bromodomains are grouped on the basis of the sequence and/or functional similarities
as described by Jeanmougin et al. [Trends in Biochemical Sciences, 22:151-153
(1997)]. Residue numbers of the P/CAF bromodomain are indicated above its
sequence. Three absolutely conserved residues, corresponding to Pro751, Pro767, and
Asn803 in the P/CAF bromodomain, are shown in red. Highly conserved residues are
colored in blue. The residues of the P/CAF bromodomain that interact with
acetyl-histamine, as determined by intermolecular NOEs, are indicated by asterisks.
The ZA loop, which is critical for acetyl-lysine binding, for each of the indicated
bromodomains is also identified. The underlined residues were changed individually
by site-directed mutagenesis to Ala. Genbank accession numbers for the proteins are
as indicated in Table 8, in the Example below, along with the SEQ ID NOs. for the
bromodomain sequences.

Figures 2A-2H depict the structure of the P/CAF bromodomain. Figures 2A-2B
show the stereoview of the Cα trace of 30 superimposed NMR-derived structures of
the bromodomain (residues 722-830). The N-terminal four residues (SKEP), which
are structurally disordered, are omitted for clarity. For the final 30 structures, the
root-mean-square deviations (RMSDs) of the backbone and all heavy atoms are 0.63
± 0.11 Å and 1.15 ± 0.12 Å for residues 723-830, respectively. The RMSDs of the
backbone and all heavy atoms for the four α-helices (residues 727-743, 770-776,
785-802, and 807-827) are 0.34 ± 0.04 Å and 0.87 ± 0.06 Å, respectively. Figures
2C-2D show the stereoview of the bromodomain structures from the bottom of the
protein, which is rotated approximately 90° from the orientation in Figures 2A-2B.
Figure 2E shows the Ribbons [Carson, M., J. Appl. Crystallogr. 24:958-961 (1991)]
depiction of the averaged minimized NMR structure of the P/CAF bromodomain.
The orientation of Figure 2E is as shown in Figures 2A-2B. Figures 2F-2G are
schematic representations of the overall topology of the up-and-down four-helix
bundle folds with the opposite handedness. The left-handed fold is seen in
bromodomain, cytochrome b5, and T4 lysozyme (left, Figure 2F), whereas the
right-handed four-helix bundles are observed in proteins such as hemerythrin and
cytochrome b562 (right, Figure 2G) [Richardson, J., Adv. Protein Chem., 34:167-339
(1989); Presnell and Cohen, Proc. Natl. Acad. Sci. USA 86:6592-6596 (1989)].
Figure 2H is a molecular surface representation of the electrostatic potential (blue =
positive; red = negative) of the bromodomain calculated in GRASP [Nicholls et al.,
Biophys. J. 64:166-170 (1993)]. The hydrophobic and aromatic residues (Tyr809,
Tyr802, Tyr760, Ala757, and Val752) located between the ZA and BC loops are
indicated.
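
For readers unfamiliar with how an ensemble RMSD of the form "0.63 ± 0.11 Å" is typically obtained, the following is a minimal sketch. It assumes the models are already superimposed and that each model is compared with the coordinate-averaged structure; the text above does not state which reference was used, so that choice and all function names here are illustrative assumptions.

```python
import math
from statistics import mean, stdev

Coord = tuple[float, float, float]


def mean_structure(ensemble: list[list[Coord]]) -> list[Coord]:
    """Average each atom position over all (pre-superimposed) models."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(a: list[Coord], b: list[Coord]) -> float:
    """Root-mean-square deviation between two equal-length atom lists."""
    squared = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(squared / len(a))


def ensemble_rmsd(ensemble: list[list[Coord]]) -> tuple[float, float]:
    """Return (mean, standard deviation) of each model's RMSD to the mean."""
    reference = mean_structure(ensemble)
    values = [rmsd(model, reference) for model in ensemble]
    return mean(values), stdev(values)
```

Restricting the atom lists to backbone atoms or to all heavy atoms, and to a residue range such as the four helices, gives the separate figures quoted in the caption.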

Figures 3A-3C show the binding of the P/CAF bromodomain to AcK. Figure 3A
shows the superimposed region of the 2D ¹⁵N-HSQC spectra of the bromodomain
(approximately 0.5 mM) in its free form (red) and complexed to the AcK-containing
H4 peptide (molar ratio 1:6) (black). Figure 3B is the Ribbon and dotted-surface
diagram of the bromodomain depicting the location of the lysine-acetylated H4
peptide binding site. The color coding reflects the chemical shift changes ($\Delta\delta$) of the
backbone amide ¹H and ¹⁵N resonances upon binding to the AcK peptide as observed
in the ¹⁵N-HSQC spectra. The normalized weighted average of the chemical shift
changes was calculated by
$\Delta/\Delta_{max} = [(\Delta\delta_{HN}^{2} + \Delta\delta_{N}^{2}/25)/2]^{1/2}/\Delta_{max}$,
where $\Delta_{max}$ is the maximum weighted chemical shift difference observed for
Tyr809 (0.16 ppm). The backbone atoms are color-coded in red, yellow, or green for
residues that have $\Delta/\Delta_{max}$ of >0.6 (Tyr809, Glu808, Asn803, and Ala757), 0.2-0.6
(Ala813, Tyr802, Tyr760, and Val752), or <0.2 (Cys812, Ser807, Cys799, Phe796, and
Phe748), respectively. The non-perturbed residues are shown in blue. Figure 3C shows the
chemical structures of acetyl-lysine, acetyl-histamine, and acetyl-histidine.
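
The weighting and color binning described for Figure 3B can be sketched as follows. This assumes the usual reading of the formula, in which the ¹⁵N shift change is scaled by 1/25 in the squared sum and the result is normalized to the largest value in the data set; the function names are hypothetical, and the cutoff separating "non-perturbed" (blue) residues is not stated in the text, so it is not modeled here.

```python
import math


def weighted_shift(d_hn: float, d_n: float) -> float:
    """Weighted average shift change (ppm) from 1H and 15N shift changes."""
    return math.sqrt((d_hn ** 2 + d_n ** 2 / 25.0) / 2.0)


def normalized_shifts(shifts: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Divide each residue's weighted shift by the maximum over all residues."""
    weighted = {res: weighted_shift(*delta) for res, delta in shifts.items()}
    max_shift = max(weighted.values())
    if max_shift == 0:
        raise ValueError("no chemical shift perturbation observed")
    return {res: value / max_shift for res, value in weighted.items()}


def color_bin(normalized: float) -> str:
    """Map a normalized shift onto the three perturbed classes of Figure 3B."""
    if normalized > 0.6:
        return "red"
    if normalized >= 0.2:
        return "yellow"
    return "green"
```

Any input values supplied to these functions would be the experimentally measured shift differences; none are given in this section, so none are reproduced here.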

Figure 4 depicts the acetyl-lysine binding pocket. This is the Ribbons [Carson, M., J.
Appl. Crystallogr. 24:958-961 (1991)] depiction of a portion of the P/CAF
bromodomain complexed with acetyl-histamine. The ligand is color-coded by
atom type.

DETAILED DESCRIPTION OF THE INVENTION 


The present invention identifies a general binding partner (ligand) for the protein
motif known as the bromodomain. Indeed, by combining structural and site-directed
mutagenesis studies the present invention demonstrates that bromodomains can
interact specifically with acetyl-lysine (AcK), making them the first protein modules
known to exhibit such interactions. Like other modular domains, such as Src
homology-2 (SH2) and phosphotyrosine binding (PTB) domains, which specifically
interact with phosphotyrosine-containing proteins, the bromodomain/acetyl-lysine
recognition provides a means to regulate protein-protein interactions via protein lysine
acetylation. The nature of the acetyl-lysine recognition by the bromodomain is
similar to that of histone acetyltransferase interaction with acetyl-CoA. The present
invention therefore couples, for the first time, the functionality of the bromodomain
with the HAT activity of coactivators in the regulation of gene transcription.

The present invention further provides both a nuclear magnetic resonance (NMR)
structure of the bromodomain from the HAT coactivator P/CAF
(p300/CBP-associated factor) as well as the structure for the P/CAF bromodomain in
complex with acetyl-histamine. The structure reveals an unusual left-handed
up-and-down four-helix bundle.

The results disclosed herein explain prior deletion experiments which showed that the
bromodomain is indispensable for the function of GCN5 in yeast.
Bromodomain-AcK binding also appears to be important for the assembly and activity
of multiprotein complexes in transcriptional activation. The results reported herein
therefore form the foundation for identifying specific biological ligands and for
defining the molecular mechanisms by which the extensive family of bromodomains
participate in chromatin remodeling and transcriptional activation.

As disclosed herein, the binding partner for the bromodomain is a peptide or protein
comprising an acetyl-lysine (AcK). Interestingly, whereas a free acetyl-lysine does not
appear to bind the bromodomain, an analog of the acetyl-lysine, acetyl-histamine,
does. This is most likely due to the additional charge present in the free amino acid.
Consistently, free acetyl-histidine also does not bind the bromodomain.


The present invention further provides a key region of the bromodomain for the
interaction with its acetyl-lysine binding partner, the ZA loop. The amino acid
sequence of the ZA loop is defined in Figure 1 for a number of bromodomains and is
depicted in Figure 2A for P/CAF. In a particular embodiment, the ZA loop has
between about 21 and 40 amino acid residues comprising the amino acid sequence of:

F X_{2-3} P X_{5-8} J_{P/K/H} X Y J_{Y/F/H} X_5 P J_{M/I/V} D (SEQ ID NO:3)

more preferably the ZA loop has about 23 to 34 amino acid residues and comprises the
amino acid sequence:

X_2 F X_{2-3} P X_{5-8} J_{P/K/H} X Y J_{Y/F/H} X_5 P J_{M/I/V} D (SEQ ID NO:43)

(1) The single letter amino acid code is used in this description, i.e., "F" for
phenylalanine; "P" for proline; "Y" for tyrosine; and "D" for aspartic acid.

(2) "X" indicates any amino acid (an undesignated amino acid); and X, X_2,
X_{2-3}, X_5, and X_{5-8} indicate one undesignated amino acid, two consecutive
undesignated amino acids, two or three consecutive undesignated amino acids, five
consecutive undesignated amino acids, and five to eight consecutive undesignated
amino acids, respectively.

(3) "J" indicates that the identity of the amino acid is restricted to a particular
group; again the one letter code is used:

(i) J_{P/K/H} is either proline, lysine or histidine.
(ii) J_{Y/F/H} is either tyrosine, phenylalanine or histidine.
(iii) J_{M/I/V} is either methionine, isoleucine, or valine.

Since this region of the bromodomain is important in binding its acetyl-lysine binding
partner, antibodies specifically raised against this region are also included in the
present invention. In a particular embodiment, the antibody is a humanized chimeric
antibody that can be used in therapeutic treatment. Thus monoclonal, chimeric, and
polyclonal antibodies raised against bromodomains, preferably against amino acid
residues in the ZA loop region, are part of the present invention. In a specific
embodiment the antibody is raised against a peptide, fusion peptide or conjugated
peptide consisting of amino acid residues 746 to 765 of SEQ ID NO:2, i.e.,
WPFMEPVKRTEAPGYYEVIR (SEQ ID NO:44). Such antibodies can be used in the
treatment of leukemia, for example. Alternatively, these antibodies can be used in drug
discovery assays.

Thus the present invention provides the first detailed structural information regarding a
bromodomain and a bromodomain complexed with its acetylated binding partner. The
present invention therefore provides the three-dimensional structure of the
bromodomain and a bromodomain acetylated binding partner complex. Since the
interaction of the bromodomain with a histone, for example, can play a significant role
in chromatin remodeling/regulation, the structural information provided herein can be
employed in methods of identifying drugs that can modulate basic cell processes by
modulating transcription. In a particular embodiment, the three-dimensional
structural information is used in the design of a small organic molecule for the
treatment of cancer.
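
The ZA loop consensus of SEQ ID NO:3 set out above lends itself to a pattern search. The sketch below is one possible reading of that consensus as a regular expression, taking the last restricted position to be methionine, isoleucine, or valine per definition (iii). The function name is hypothetical, and in the usage note the four residues appended to SEQ ID NO:44 are an assumed tail added only so that the example string spans the whole motif.

```python
import re

# SEQ ID NO:3 consensus, read left to right:
# F, X(2-3), P, X(5-8), [P/K/H], X, Y, [Y/F/H], X(5), P, [M/I/V], D
ZA_LOOP_PATTERN = re.compile(
    r"F[A-Z]{2,3}P[A-Z]{5,8}[PKH][A-Z]Y[YFH][A-Z]{5}P[MIV]D"
)


def find_za_loop_motif(sequence: str):
    """Return (start, end) of the first consensus match, or None.

    Positions are zero-based string indices, not protein residue numbers.
    """
    match = ZA_LOOP_PATTERN.search(sequence.upper())
    return match.span() if match else None
```

As a usage note, `find_za_loop_motif("WPFMEPVKRTEAPGYYEVIR" + "FPMD")` locates a match beginning at the phenylalanine, whereas the 20-residue peptide of SEQ ID NO:44 alone is too short to contain the full consensus and yields `None`.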

Indeed, the bromodomain and lysine-acetylated protein interaction can now be
implicated to play a causal role in the development of a number of diseases including
cancers such as leukemia. For example, chromatin remodeling plays a central role in
the etiology of viral infection and cancer [Archer and Hodin, Curr. Opin. Genet. Dev.
9:171-174 (1999); Jacobson and Pillus, Curr. Opin. Genet. Dev. 9:175-184 (1999)].
Both altered histone acetylation/deacetylation and aberrant forms of
chromatin-remodeling complexes are associated with human diseases. Furthermore,
chromosomal translocations of various cellular genes with those encoding HATs and
subunits of chromatin remodeling complexes have been implicated in leukemogenesis.
The MOZ (monocytic leukemia zinc finger) and MLL/ALL-1 genes are frequently fused
to the gene encoding the co-activator HAT CBP [Sobulo et al., Proc. Natl. Acad. Sci.
USA 94:8732-8737 (1997)]. The resulting fusion protein MLL-CBP contains the
tandem bromodomain-PHD finger-HAT domain of CBP. It also has been shown that
both the bromodomain and HAT domain of CBP are required for leukemogenesis,
because deletion of either the bromodomain or the HAT domain results in loss of the
MLL-CBP fusion protein's ability to transform cells. These results indicate that the
CBP bromodomain, and more particularly, the ZA loop of the CBP bromodomain, is
an excellent target for developing drugs that interfere with the bromodomain
acetyl-lysine interaction and that can be used in the treatment of human acute leukemia. In
addition, an antibody (e.g., a humanized antibody) raised specifically against a peptide
from the ZA loop of the CBP bromodomain could also be effective for treating these
conditions.

Furthermore, the human immunodeficiency virus type 1 (HIV-1) trans-activator
protein, Tat, is absolutely required for productive HIV viral replication [Jeang and
Gatignol, Curr. Top. Microbiol. Immunol., 188:123-144 (1994)]. Recently, it has been
shown that HIV-1 Tat transcriptional activity is tightly regulated by lysine acetylation
[Kiernan et al., EMBO Journal 18:6106-6118 (1999)]. Therefore, the interaction of
the acetyl-lysine of Tat with one or more bromodomain-containing proteins associated
with chromatin remodeling could mediate gene transcription. Thus, the
bromodomain/lysine-acetylated Tat interaction could also serve as a drug target for
blocking HIV replication in cells. Similarly, an antibody raised specifically against a
peptide from the ZA loop of the bromodomain could also be effective for treating these
conditions.

In addition, based on the new structural information disclosed herein, the key amino 
acid residues for the binding of a given bromodomain and its binding partner can be 
identified and further elucidated using basic mutagenesis and standard isothermal 
titration calorimetry, for example. In this case, both the crucial amino acids for the 
bromodomain and the binding partner (i.e., apart from the acetyl-lysine) can be readily 
determined and are also part of the present invention. 

The results obtained from the structural and functional studies disclosed herein provide 
the foundation for both high throughput drug screening and structure-based rational 
drug design. The agents identified by this procedure will be useful for ameliorating 
conditions involving chromatin remodeling/regulation as indicated above. 

Structure-based rational drug design is the most efficient method of drug development.
However, heretofore, no information has been disclosed regarding the structure of the
bromodomain or, more importantly, its interaction with the acetyl-lysine of its binding
partner. Obtaining detailed structural information requires an extensive NMR or X-ray
crystallographic analysis. By determining and then exploiting the detailed structural
information of the bromodomain and of the bromodomain/acetyl-histamine complex
(exemplified by NMR analysis below), the present invention provides novel methods
for developing new drugs through structure-based rational drug design.

Thus the present invention provides representative sets of the atomic structure
coordinates of the free form of the P/CAF bromodomain (Table 5) and of the P/CAF
bromodomain-acetyl-histamine complex (Table 6), which were both obtained by NMR
analysis. A ribbon diagram of the three-dimensional structure of the P/CAF
bromodomain is depicted in Figure 2E, whereas the P/CAF bromodomain acetyl-lysine
binding pocket is depicted in Figure 4. The present invention also provides the
NOE-derived distance restraints and NMR chemical shift assignments of the P/CAF
bromodomain. The NMR chemical shift assignments of the P/CAF bromodomain are
included in the chemical shift table (Table 1) for the 1H-15N HSQC spectrum of the
P/CAF bromodomain. The unambiguous NOE-derived Inter-proton Distance Restraints
(Table 2), the ambiguous NOE-derived Inter-proton Distance Restraints (Table 3) and
the H-bonding restraints (Table 4) are also disclosed herein. The sample atomic
coordinate data provided enable the skilled artisan to practice the invention. In
addition, Tables 1-6 are also capable of being placed into a computer readable form,
which is also part of the present invention. Furthermore, methods of using these
coordinates and chemical shifts and related information (including in computer
readable forms) either individually or together in drug assays are also provided. More
particularly, such atomic coordinates can be used to identify potential ligands or drugs
which will modulate the binding of a bromodomain with its binding partner.
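
As one illustration of how coordinates and distance restraints of this kind might be held in computer readable form, the following Python sketch checks a set of upper-bound inter-proton distance restraints against a coordinate set. The record layout, the atom labels, and the toy coordinates are assumptions made for this example only; they are not taken from Tables 2-6.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Restraint:
    """Hypothetical record for one upper-bound distance restraint."""
    atom_a: str    # illustrative atom label, e.g. "A1:HN"
    atom_b: str
    upper: float   # upper distance bound in angstroms


def violations(coords, restraints, tolerance=0.0):
    """Return (restraint, excess) pairs where the distance exceeds the bound."""
    found = []
    for r in restraints:
        d = math.dist(coords[r.atom_a], coords[r.atom_b])
        if d > r.upper + tolerance:
            found.append((r, d - r.upper))
    return found


# Toy values for illustration; not data from the disclosure.
toy_coords = {
    "A1:HN": (0.0, 0.0, 0.0),
    "A2:HA": (3.0, 4.0, 0.0),
    "A3:HB": (1.0, 0.0, 0.0),
}
toy_restraints = [
    Restraint("A1:HN", "A2:HA", 4.5),
    Restraint("A1:HN", "A3:HB", 2.5),
]

for restraint, excess in violations(toy_coords, toy_restraints):
    print(f"{restraint.atom_a} - {restraint.atom_b}: exceeds bound by {excess:.2f} A")
```

A check of this kind could be run over each member of a structure ensemble to flag restraints that a candidate model fails to satisfy.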

Therefore, if appearing herein, the following terms shall have the definitions set out 
below. 

As used herein a "bromodomain-acetyl-lysine binding complex" is a binding complex
between a bromodomain or fragment thereof and either a peptide/polypeptide 
comprising an acetyl-lysine (or an analog of acetyl-lysine), or a free analog of acetyl- 
lysine, such as acetyl-histamine disclosed in the Example below. Preferably, the 
peptide comprises at least six amino acids in addition to the acetyl-lysine. The 
dissociation constant of a bromodomain-acetyl-lysine binding complex is dependent 
on whether the lysine residue or analog thereof is acetylated or not, such that the 
affinity for the bromodomain and the peptide comprising the lysine residue (for 
example) significantly decreases when that lysine residue is not acetylated. 

As used herein a "ZA loop" of a bromodomain is one portion of a bromodomain that is
involved in the binding of the bromodomain to the acetyl-lysine. The structure of the
ZA loop of the bromodomain of P/CAF is depicted in Figure 2A. The ZA loop has
between about 20 and 40 amino acids and comprises the amino acid sequence of SEQ
ID NO:3. More preferably the ZA loop comprises between about 23 and 34 amino acids
and has the amino acid sequence SEQ ID NO:43. The amino acid sequence of the ZA
loop for a representative number of individual bromodomains is shown in Figure 1.



A "polypeptide" or "peptide" comprising a fragment of a bromodomain, such as the
ZA loop, or a peptide or polypeptide comprising an acetyl-lysine, as used herein can be
the "fragment" alone, or a larger chimeric or fusion peptide/protein which contains the
"fragment".

As used herein the terms "fusion protein" and "fusion peptide" are used
interchangeably and encompass "chimeric proteins and/or chimeric peptides" and
fusion "intein proteins/peptides". A fusion protein comprises at least a portion of a
protein or peptide of the present invention, e.g., a bromodomain, joined via a peptide
bond to at least a portion of another protein or peptide including, e.g., a second
bromodomain in a chimeric fusion protein. In a particular embodiment the portion of
the bromodomain is antigenic. Fusion proteins can comprise a marker protein or
peptide, or a protein or peptide that aids in the isolation and/or purification of the
protein, for example.

As used herein, and unless otherwise specified, the terms "agent", "potential drug",
"compound", "test compound" or "potential compound" are used interchangeably, and
refer to chemicals which potentially have a use as an inhibitor or activator/stabilizer of
bromodomain-acetyl-lysine binding. Therefore, such "agents", "potential drugs",
"compounds" and "potential compounds" may be used, as described herein, in drug
assays and drug screens and the like.

As used herein a "small organic molecule" is an organic compound, including a 
peptide [or organic compound complexed with an inorganic compound (e.g., metal)] 
that has a molecular weight of less than 3 kilodaltons. Such small organic molecules
can be included as agents, etc. as defined above. 

As used herein the term "binds to" is meant to include all such specific interactions that 
result in two or more molecules showing a preference for one another relative to some 
third molecule. This includes processes such as covalent, ionic, hydrophobic and 
hydrogen bonding but does not include non-specific associations such as solvent
preferences. 

As used herein the term "about" signifies that a value is within twenty percent of the
indicated value, i.e., a peptide containing "about" 20 amino acid residues can contain
between 16 and 24 amino acid residues.
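
The twenty percent rule can be written out as a short calculation. The Python sketch below is illustrative only; the function names `about_range` and `is_about` are hypothetical and do not appear in the disclosure.

```python
def about_range(value, percent=20):
    """Return the (low, high) bounds implied by "about" for a given value."""
    delta = value * percent / 100
    return (value - delta, value + delta)


def is_about(candidate, value, percent=20):
    """True when candidate lies within the "about" window around value."""
    low, high = about_range(value, percent)
    return low <= candidate <= high


print(about_range(20))   # bounds for "about" 20 residues: (16.0, 24.0)
print(is_about(24, 20))  # True
print(is_about(25, 20))  # False
```

Applied to the ZA loop definition above, the same calculation gives the window implied by a phrase such as "about 20" amino acids.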

General Techniques for Constructing Nucleic Acids That Encode the Bromodomains
and Fragments Thereof (Including ZA Loops); and the Bromodomain Binding
Partners of the Present Invention.

In accordance with the present invention there may be employed conventional
molecular biology, microbiology, and recombinant DNA techniques within the skill of
the art. Such techniques are explained fully in the literature. See, e.g., Sambrook,
Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989)
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein
"Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II
(D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid
Hybridization [B.D. Hames & S.J. Higgins eds. (1985)]; Transcription And
Translation [B.D. Hames & S.J. Higgins, eds. (1984)]; Animal Cell Culture [R.I.
Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal,
A Practical Guide To Molecular Cloning (1984); F.M. Ausubel et al. (eds.), Current
Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).

Therefore, if appearing herein, the following terms shall have the definitions set out
below.

As used herein, the term "gene" refers to an assembly of nucleotides that encode a 
polypeptide, and includes cDNA and genomic DNA nucleic acids. 

A "vector" is a replicon, such as a plasmid, phage or cosmid, to which another DNA
segment may be attached so as to bring about the replication of the attached segment.
A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions
as an autonomous unit of DNA replication in vivo, i.e., capable of replication under its
own control.

A "cassette" refers to a segment of DNA that can be inserted into a vector at specific 
restriction sites. The segment of DNA encodes a polypeptide of interest, and the 
cassette and restriction sites are designed to ensure insertion of the cassette in the 
proper reading frame for transcription and translation. 

A cell has been "transfected" by exogenous or heterologous DNA when such DNA has 
been introduced inside the cell. 

A "nucleic acid molecule" refers to the phosphate ester polymeric form of 
ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or 
deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or 
deoxycytidine; "DNA molecules"), or any phosphoester analogues thereof, such as 
phosphorothioates and thioesters, in either single stranded form, or a double-stranded 
helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. 
The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only 
to the primary and secondary structure of the molecule, and does not limit it to any 
particular tertiary forms. Thus, this term includes double-stranded DNA found, inter 
alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and
chromosomes. In discussing the structure of particular double-stranded DNA 
molecules, sequences may be described herein according to the normal convention of 
giving only the sequence in the 5' to 3' direction along the nontranscribed strand of 
DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant 
DNA molecule" is a DNA molecule that has undergone a molecular biological 
manipulation. 
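
The strand convention described above can be illustrated with a brief Python sketch. Given a sequence written 5' to 3' along the nontranscribed strand, the matching mRNA has the same sequence with uracil in place of thymine, and the transcribed (template) strand is its reverse complement. The function names and the example sequence are hypothetical and are used only for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mrna_from_nontranscribed(nontranscribed):
    """mRNA sequence (5' to 3') matching the nontranscribed DNA strand."""
    return nontranscribed.upper().replace("T", "U")


def template_strand(nontranscribed):
    """Transcribed strand, written 5' to 3' (the reverse complement)."""
    return nontranscribed.upper().translate(_COMPLEMENT)[::-1]


example = "ATGGCC"  # arbitrary illustrative sequence
print(mrna_from_nontranscribed(example))  # AUGGCC
print(template_strand(example))           # GGCCAT
```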

A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a
cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid
molecule can anneal to the other nucleic acid molecule under the appropriate
conditions of temperature and solution ionic strength (see Sambrook et al., supra).
The conditions of temperature and ionic strength determine the "stringency" of the
hybridization. For preliminary screening for homologous nucleic acids, low stringency
hybridization conditions, corresponding to a Tm of 55 °C, can be used (e.g., 5x SSC,
0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5x SSC, 0.5% SDS).
Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40%
formamide, with 5x or 6x SSC. High stringency hybridization conditions correspond
to the highest Tm, e.g., 50% formamide, 5x or 6x SSC. Hybridization requires that the
two nucleic acids contain complementary sequences, although depending on the
stringency of the hybridization, mismatches between bases are possible. The
appropriate stringency for hybridizing nucleic acids depends on the length of the
nucleic acids and the degree of complementation, variables well known in the art. The
greater the degree of similarity or homology between two nucleotide sequences, the
greater the value of Tm for hybrids of nucleic acids having those sequences. The
relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases
in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater
than 100 nucleotides in length, equations for calculating Tm have been derived (see
Sambrook et al., supra, 9.50-10.51). For hybridization with shorter nucleic acids, i.e.,
oligonucleotides, the position of mismatches becomes more important, and the length
of the oligonucleotide determines its specificity (see Sambrook et al., supra,
11.7-11.8). Preferably a minimum length for a hybridizable nucleic acid is at least about 12
nucleotides; preferably at least about 18 nucleotides; and more preferably the length is
at least about 27 nucleotides; and most preferably 36 nucleotides.
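
The text above refers to published equations for Tm without reproducing them. As a hedged illustration only, the Python sketch below uses a widely quoted empirical approximation for DNA:DNA hybrids longer than about 100 nucleotides, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(% formamide) - 600/L. The choice of this particular approximation, the function name, and the example inputs are assumptions; the cited reference should be consulted for the equation actually intended.

```python
import math


def estimate_tm_dna(na_molar, gc_percent, length_nt, formamide_percent=0.0):
    """Approximate Tm (deg C) of a long DNA:DNA hybrid (assumed empirical formula).

    na_molar          monovalent cation concentration in mol/L
    gc_percent        G+C content of the hybrid, 0-100
    length_nt         hybrid length in nucleotides
    formamide_percent formamide concentration, 0-100
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("na_molar and length_nt must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)


# Illustrative inputs only: adding formamide lowers the estimated Tm.
no_formamide = estimate_tm_dna(1.0, 50.0, 200)
with_formamide = estimate_tm_dna(1.0, 50.0, 200, formamide_percent=50.0)
print(round(no_formamide, 1), round(with_formamide, 1))
```

Under this approximation, raising the formamide concentration lowers the estimated Tm, which is consistent with the use of higher formamide percentages in the more stringent conditions listed above.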

In a specific embodiment, the term "standard hybridization conditions" refers to a Tm
of 55 °C, and utilizes conditions as set forth above. In a preferred embodiment, the Tm
is 60 °C; in a more preferred embodiment, the Tm is 65 °C.

A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed 
and translated into a polypeptide in a cell in vitro or in vivo when placed under the 
control of appropriate regulatory sequences. The boundaries of the coding sequence 
are determined by a start codon at the 5' (amino) terminus and a translation stop codon 
at the 3' (carboxyl) terminus. A coding sequence can include, but is not limited to, 
prokaryotic sequences and synthetic DNA sequences. If the coding sequence is 
intended for expression in a eukaryotic cell, a polyadenylation signal and transcription
termination sequence will usually be located 3' to the coding sequence. 

Transcriptional and translational control sequences are DNA regulatory sequences, 
such as promoters, enhancers, terminators, and the like, that provide for the expression 
of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are 
control sequences. 

A "promoter sequence" is a DNA regulatory region capable of binding RNA 
polymerase in a cell and initiating transcription of a downstream (3' direction) coding 
sequence. For purposes of defining the present invention, the promoter sequence is 
bounded at its 3' terminus by the transcription initiation site and extends upstream (5' 
direction) to include the minimum number of bases or elements necessary to initiate 
transcription at levels detectable above background. Within the promoter sequence 
will be found a transcription initiation site (conveniently defined for example, by 
mapping with nuclease S1), as well as protein binding domains (consensus sequences)
responsible for the binding of RNA polymerase. 

A coding sequence is "under the control" of transcriptional and translational control 
sequences in a cell when RNA polymerase transcribes the coding sequence into 
mRNA, which is then trans-RNA spliced and translated into the protein encoded by the 
coding sequence. 

A DNA sequence is "operatively linked" to an expression control sequence when the 
expression control sequence controls and regulates the transcription and translation of 
that DNA sequence. The term "operatively linked" includes having an appropriate 
start signal (e.g., ATG) in front of the DNA sequence to be expressed and maintaining 
the correct reading frame to permit expression of the DNA sequence under the control 
of the expression control sequence and production of the desired product encoded by 
the DNA sequence. If a gene that one desires to insert into a recombinant DNA 
molecule does not contain an appropriate start signal, such a start signal can be inserted 
in front of the gene.

As used herein, the term "homologous" in all its grammatical forms refers to the
relationship between proteins that possess a "common evolutionary origin," including
proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous
proteins from different species (e.g., myosin light chain, etc.) [Reeck et al., Cell
50:667 (1987)]. Such proteins have sequence homology as reflected by their high
degree of sequence similarity.

Accordingly, the term "sequence similarity" in all its grammatical forms refers to the 
degree of identity or correspondence between nucleic acid or amino acid sequences of 
proteins that may or may not share a common evolutionary origin (see Reeck et al.,
supra). However, in common usage and in the instant application, the term 
"homologous," when modified with an adverb such as "highly," may refer to sequence 
similarity and not a common evolutionary origin. 

Two DNA sequences are "substantially homologous" when at least about 60% 
(preferably at least about 80%, and most preferably at least about 90 or 95%) of the 
nucleotides match over the defined length of the DNA sequences. Sequences that are 
substantially homologous can be identified by comparing the sequences using standard 
software available in sequence data banks, or in a Southern hybridization experiment 
under, for example, stringent conditions as defined for that particular system. Defining 
appropriate hybridization conditions is within the skill of the art. See, e.g., Maniatis et 
al., supra; DNA Cloning, Vols. I & II, supra; Nucleic Acid Hybridization, supra. 

As used herein an amino acid sequence is 100% "homologous" to a second amino acid 
sequence if the two amino acid sequences are identical, and/or differ only by neutral or 
conservative substitutions as defined below. Accordingly, an amino acid sequence is 
50% "homologous" to a second amino acid sequence if 50% of the two amino acid 
sequences are identical, and/or differ only by neutral or conservative substitutions. 

As used herein, DNA and protein sequence percent identity can be determined using 
MacVector 6.0.1, Oxford Molecular Group PLC (1996) and the Clustal W algorithm 
with the alignment default parameters, and default parameters for identity. These 
commercially available programs can also be used to determine sequence similarity
using the same or analogous default parameters. 

The term "corresponding to" is used herein to refer to similar or homologous sequences,
whether the exact position is identical or different from the molecule to which the
similarity or homology is measured. Thus, the term "corresponding to" refers to the
sequence similarity, and not the numbering of the amino acid residues or nucleotide
bases.

As used herein a "heterologous nucleotide sequence" is a nucleotide sequence that is
added to a nucleotide sequence of the present invention by recombinant methods to
form a nucleic acid which is not formed in nature. Such nucleic acids can
encode fusion proteins or peptides, including chimeric proteins and peptides. Thus the
heterologous nucleotide sequence can encode peptides and/or proteins which contain
regulatory and/or structural properties. In another such embodiment the heterologous
nucleotide can encode a protein or peptide that functions as a means of detecting the
protein or peptide encoded by the nucleotide sequence of the present invention after the
recombinant nucleic acid is expressed. In still another such embodiment the
heterologous nucleotide can function as a means of detecting a nucleotide sequence of
the present invention. A heterologous nucleotide sequence can comprise non-coding
sequences including restriction sites, regulatory sites, promoters and the like.

The present invention also relates to cloning vectors containing nucleic acids encoding
analogs and derivatives of the bromodomains of the present invention and
polypeptides/peptides that can bind a bromodomain when a lysine of the
polypeptide/peptide is acetylated, including modified fragments, that have the same
or homologous functional activity as the individual fragments, and homologs thereof.
The production and use of derivatives and analogs related to the fragments are within
the scope of the present invention.

Due to the degeneracy of nucleotide coding sequences, other DNA sequences which
encode substantially the same amino acid sequence as a nucleic acid encoding a protein
comprising a bromodomain or bromodomain binding partner (i.e., when
post-translationally acetylated) of the present invention, for example, may be used in
the practice of the present invention. These include but are not limited to allelic genes,
homologous genes from other species, which are altered by the substitution of different 
codons that encode the same amino acid residue within the sequence, thus producing a 
silent change. Likewise, the peptides and polypeptides of the present invention 
include, but are not limited to, those containing, as a primary amino acid sequence, 
analogous portions of their respective amino acid sequences including altered 
sequences in which functionally equivalent amino acid residues are substituted for 
residues within the sequence resulting in a conservative amino acid substitution. For 
example, one or more amino acid residues within the sequence can be substituted by 
another amino acid of a similar polarity, which acts as a functional equivalent, 
resulting in a silent alteration. Substitutes for an amino acid within the sequence may 
be selected from other members of the class to which the amino acid belongs. For 
example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, 
valine, proline, phenylalanine, tryptophan and methionine. Amino acids containing 
aromatic ring structures are phenylalanine, tryptophan, and tyrosine. The polar neutral 
amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine. The positively charged (basic) amino acids include arginine, and lysine. 
The negatively charged (acidic) amino acids include aspartic acid and glutamic acid. 
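
The notion of a silent change arising from codon degeneracy can be illustrated with a short Python sketch that translates two DNA sequences with the standard genetic code and reports whether they differ at the nucleotide level while encoding the same amino acids. The function names and example sequences are hypothetical and serve only as an illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence (5' to 3'); '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent_change(original, altered):
    """True when the DNA differs but the encoded amino acids are identical."""
    return (original.upper() != altered.upper()
            and translate(original) == translate(altered))


print(translate("AAAGAA"))                   # KE  (Lys-Glu)
print(is_silent_change("AAAGAA", "AAGGAG"))  # True: synonymous codons
print(is_silent_change("AAAGAA", "AAAGAT"))  # False: Glu becomes Asp
```

The last call shows a change that alters the encoded residue; whether such a change is conservative is a separate question addressed by the amino acid classes described above.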

Particularly preferred conserved amino acid exchanges are: 

(a) Lys for Arg or vice versa such that a positive charge may be maintained; 

(b) Glu for Asp or vice versa such that a negative charge may be maintained; 

(c) Ser for Thr or vice versa such that a free -OH can be maintained; 

(d) Gln for Asn or vice versa such that a free NH2 can be maintained;

(e) Ile for Leu or for Val or vice versa as roughly equivalent hydrophobic amino acids;
and 

(f) Phe for Tyr or vice versa as roughly equivalent aromatic amino acids. 
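
The percent "homologous" definition given earlier for amino acid sequences can be sketched using exchanges (a)-(f) as the set of conservative substitutions. The Python example below is a minimal illustration under stated assumptions: the two sequences are taken to be already aligned and of equal length with no gaps, only the exchanges listed in (a)-(f) are counted as conservative, and the function names are hypothetical.

```python
# Exchanges (a)-(f), in one-letter code; each pair applies in both directions.
CONSERVATIVE_PAIRS = {
    frozenset(pair)
    for pair in [
        ("K", "R"),  # (a) Lys / Arg
        ("E", "D"),  # (b) Glu / Asp
        ("S", "T"),  # (c) Ser / Thr
        ("Q", "N"),  # (d) Gln / Asn
        ("I", "L"),  # (e) Ile / Leu
        ("I", "V"),  # (e) Ile / Val
        ("F", "Y"),  # (f) Phe / Tyr
    ]
}


def is_conservative(a, b):
    """True when swapping residue a for b is one of the listed exchanges."""
    return frozenset((a, b)) in CONSERVATIVE_PAIRS


def percent_homology(seq_a, seq_b):
    """Percent of aligned positions that are identical or conservatively exchanged."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b or is_conservative(a, b)
    )
    return 100.0 * hits / len(seq_a)


print(percent_homology("KELF", "RDIF"))  # 100.0
print(percent_homology("KELF", "KEAA"))  # 50.0
```

A fuller treatment would first align the sequences, for example with the Clustal W algorithm mentioned above, and would need a rule for gaps; both are outside this sketch.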

A conservative change generally leads to less change in the structure and function of 
the resulting protein. A non-conservative change is more likely to alter the structure, 
activity or function of the resulting protein. The present invention should be
considered to include sequences containing conservative changes which do not
significantly alter the activity or binding characteristics of the resulting protein.
Specific amino acid residues for the P/CAF bromodomain have been identified that are
important for binding, indicating a potential lower stringency for the substitution of the
remaining amino acid residues.

All of the peptides/fragments of the present invention can be modified by being placed 
in a fusion or chimeric peptide or protein, or labeled e.g., to have an N-terminal FLAG- 
tag, or H6 tag. In a particular embodiment the P/CAF bromodomain fragment can be 
modified to contain a marker protein such as green fluorescent protein as described in 
U.S. Patent No. 5,625,048 filed April 29, 1997 and WO 97/26333, published July 24, 
1997, each of which is hereby incorporated by reference herein in its entirety.

The nucleic acids encoding peptides and protein fragments of the present invention and
analogs thereof can be produced by various methods known in the art. The
manipulations which result in their production can occur at the gene or protein level
[Sambrook et al., 1989, supra]. The nucleotide sequence can be cleaved at appropriate
sites with restriction endonuclease(s), followed by further enzymatic modification if
desired, isolated, and ligated in vitro. In addition a nucleic acid sequence can be
mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or
termination sequences, or to create variations in coding regions and/or form new
restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro
modification. Any technique for mutagenesis known in the art can be used, including
but not limited to, in vitro site-directed mutagenesis [Hutchinson et al., J. Biol. Chem.,
253:6551 (1978); Zoller and Smith, DNA, 3:479-488 (1984); Oliphant et al., Gene,
44:177 (1986); Hutchinson et al., Proc. Natl. Acad. Sci. U.S.A., 83:710 (1986)], use of
TAB® linkers (Pharmacia), etc. PCR techniques are preferred for site directed
mutagenesis [see Higuchi, 1989, "Using PCR to Engineer DNA", in PCR Technology:
Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press,
Chapter 6, pp. 61-70].

The identified and isolated nucleic acids can then be inserted into an appropriate
cloning vector. A large number of vector-host systems known in the art may be used. 

Protein expression and purification 

A bacterial protein expression system can be used to make various stable isotopically
labeled (13C, 15N, and 2H) protein samples that are useful for a three-dimensional NMR
structural determination of a protein complex. For example, a pET14b (Novagen)
bacterial expression vector can be constructed which expresses the recombinant P/CAF
bromodomain as an amino-terminal His-tagged fusion protein.

Protein expression and purification can be conducted using standard procedures for
His-tagged proteins [Zhou et al., J. Biol. Chem. 270:31119-31123 (1995)]. To
optimize the level of protein expression, various bacterial growth and expression
conditions can be screened, which include different E. coli cell lines, and growth and
protein induction temperatures. Generally, it is preferred to obtain the maximum
amount of soluble protein while still inducing protein expression with a relatively low
IPTG concentration, e.g., ~0.2 mM (final concentration) at 16 °C. As exemplified
below, the bromodomain of P/CAF (residues 719-832 of SEQ ID NO:2, which is SEQ
ID NO:7) was subcloned into the pET14b expression vector (Novagen) and expressed
in Escherichia coli BL21(DE3) cells. Uniformly 15N- and 15N/13C-labeled proteins
were prepared by growing bacteria in a minimal medium containing 15NH4Cl with or
without 13C6-glucose. A uniformly 15N/13C-labeled and fractionally deuterated protein
sample was prepared by growing the cells in 75% 2H2O. The bromodomain was
purified by affinity chromatography on a nickel-IDA column (Invitrogen) followed by
the removal of the poly-His tag by thrombin cleavage. The final purification of the
protein was achieved by size-exclusion chromatography. The acetyl-lysine-containing
peptides were prepared on a MilliGen 9050 peptide synthesizer (Perkin Elmer) using
Fmoc/HBTU chemistry. Acetyl-lysine was incorporated using the reagent
Fmoc-Ac-Lys with HBTU/DIPEA activation. NMR samples contained approximately
1 mM protein in 100 mM phosphate buffer of pH 6.5 and 5 mM perdeuterated DTT and
0.5 mM EDTA in H2O/2H2O (9/1) or 2H2O.

One major advantage of using the heteronuclear multidimensional approach, as
exemplified herein, is that the NMR resonance assignments of a protein are obtained in a
sequence-specific manner which assures accuracy and greatly facilitates data analysis
and structure determination [Clore, G. M. & Gronenborn, A. M. Meth. Enzymol.
239:249-363 (1994)]. In addition, the signal overlapping problems in the protein
spectra are minimized by the use of multidimensional NMR spectra, which separate
the proton signals according to the chemical shifts of their attached hetero-nuclei (such
as 15N and 13C). This NMR approach has proven very powerful for structural
analysis of large proteins [Clore, G. M. & Gronenborn, A. M. Meth. Enzymol.
239:249-363 (1994)]. To facilitate sequence-specific resonance assignments for the
structural study, a uniformly 13C, 15N-labeled and fractionally (75%) deuterated protein
sample of the bromodomain can be prepared by growing bacterial cells in 75% 2H2O as
exemplified below. Such protein samples can be used for triple-resonance NMR
experiments. A triple-labeled protein sample is useful for high-resolution NMR
structural studies. Because of the favorable 1H, 13C, and 15N relaxation rates caused by
the partial deuteration of the protein, constant-time triple-resonance NMR spectra can
be acquired with higher digital resolution and sensitivity [Sattler, M. & Fesik, S. W.
Structure 4:1245-1249 (1996)]. In addition, various stable-isotopically labeled (15N
and 13C/15N) proteins can also be prepared using this procedure.

Synthetic Polypeptides 

The term "polypeptide" is used in its broadest sense to refer to a compound of two or 
more subunit amino acids, amino acid analogs, or peptidomimetics. The subunits are 
linked by peptide bonds. The terms "polypeptide", "protein", and "peptide" are used 
interchangeably herein, though preferably as used herein a "peptide" refers to a 
compound of at least two but less than fifty subunit amino acids, and a polypeptide or 
protein refers to a compound of fifty or more amino acids. The polypeptides of the
present invention may be chemically synthesized or as detailed above, genetically 
engineered or isolated from natural sources. 

In addition, potential drugs or agents that may be tested in the drug screening assays of 
the present invention may also be chemically synthesized. When the peptide is to be 
modified, e.g., acetylated, the modification can be at any time during the peptide
synthesis, including using an acetyl-lysine as a starting material or acetylating a lysine 
residue of a peptide after the peptide has been synthesized. In the Example below, the 
acetyl-lysine-containing peptides were prepared on a MilliGen 9050 peptide 
synthesizer (Perkin Elmer) using Fmoc/HBTU chemistry. Acetyl-lysine was 
incorporated using the reagent Fmoc-Ac-Lys with HBTU/DIPEA activation. 

Thus, synthetic polypeptides, prepared using the well known techniques of solid phase, 
liquid phase, or peptide condensation techniques, or any combination thereof, can 
include natural and unnatural amino acids. Amino acids used for peptide synthesis 
may be standard Boc (N a -amino protected N a -t-butyloxycarbonyl) amino acid resin 
with the standard deprotecting, neutralization, coupling and wash protocols of the 
original solid phase procedure of Merrifield [J. Am. Chem. Soc, 85:2149-2154 
(1963)], or the base-labile N tt -amino protected 9-fluorenylmethoxycarbonyl (Fmoc) 
amino acids first described by Carpino and Han [J. Org. Chem., 37:3403-3409 (1972)]. 
Both Fmoc and Boc N a -amino protected amino acids can be obtained from Fluka, 
Bachem, Advanced Chemtech, Sigma, Cambridge Research Biochemical, Bachem, or 
Peninsula Labs or other chemical companies familiar to those who practice this art. In 
addition, the method of the invention can be used with other N a -protecting groups that 
are familiar to those skilled in this art. Solid phase peptide synthesis may be 
accomplished by techniques familiar to those in the art and provided, for example, in 
Stewart and Young [Solid Phase Synthesis, Second Edition, Pierce Chemical Co., 
Rockford, IL (1984)] and Fields and Noble [Int. J. Pept. Protein Res., 35:161-214 
(1990)], or using automated synthesizers, such as those sold by ABS. Thus, polypeptides of
the invention may comprise D-amino acids, a combination of D- and L-amino acids,
and various "designer" amino acids (e.g., β-methyl amino acids, Cα-methyl amino
acids, and Nα-methyl amino acids, etc.) to convey special properties. Synthetic amino
acids include ornithine for lysine, fluorophenylalanine for phenylalanine, and
norleucine for leucine or isoleucine. Additionally, by assigning specific amino acids at
specific coupling steps, α-helices, β turns, β sheets, γ-turns, and cyclic peptides can be
generated.

In a further embodiment, subunits of peptides that confer useful chemical and
structural properties will be chosen. For example, peptides comprising D-amino acids 
will be resistant to L-amino acid-specific proteases in vivo. In addition, the present 
invention envisions preparing peptides that have more well defined structural 
properties, and the use of peptidomimetics, and peptidomimetic bonds, such as ester 
bonds, to prepare peptides with novel properties. In another embodiment, a peptide 
may be generated that incorporates a reduced peptide bond, i.e., R1-CH2-NH-R2, where
R1 and R2 are amino acid residues or sequences. A reduced peptide bond may be
introduced as a dipeptide subunit. Such a molecule would be resistant to peptide bond 
hydrolysis, e.g., protease activity. Such peptides would provide ligands with unique 
function and activity, such as extended half-lives in vivo due to resistance to metabolic 
breakdown, or protease activity. Furthermore, it is well known that in certain systems 
constrained peptides show enhanced functional activity [Hruby, Life Sciences, 31:189-
199 (1982); Hruby et al., Biochem. J., 268:249-262 (1990)]; the present invention
provides a method to produce a constrained peptide that incorporates random 
sequences at all other positions. 

Constrained and cyclic peptides. A constrained, cyclic or rigidized peptide may be 
prepared synthetically, provided that in at least two positions in the sequence of the 
peptide an amino acid or amino acid analog is inserted that provides a chemical 
functional group capable of crosslinking to constrain, cyclise or rigidize the peptide 
after treatment to form the crosslink. Cyclization will be favored when a turn-inducing 
amino acid is incorporated. Examples of amino acids capable of crosslinking a peptide 
are cysteine to form disulfides, aspartic acid to form a lactone or a lactam, and a 
chelator such as γ-carboxyl-glutamic acid (Gla) (Bachem) to chelate a transition metal
and form a cross-link. Protected γ-carboxyl glutamic acid may be prepared by
modifying the synthesis described by Zee-Cheng and Olson [Biophys. Biochem. Res.
Commun., 94:1128-1132 (1980)]. A peptide in which the peptide sequence comprises
at least two amino acids capable of crosslinking may be treated, e.g., by oxidation of 
cysteine residues to form a disulfide or addition of a metal ion to form a chelate, so as 
to crosslink the peptide and form a constrained, cyclic or rigidized peptide.

The present invention provides strategies to systematically prepare cross-links. For
example, if four cysteine residues are incorporated in the peptide sequence, different
protecting groups may be used [Hiskey, in The Peptides: Analysis, Synthesis, Biology,
Vol. 3, Gross and Meienhofer, eds., Academic Press: New York, pp. 137-167 (1981);
Ponsanti et al., Tetrahedron, 46:8255-8266 (1990)]. The first pair of cysteines may be
deprotected and oxidized, then the second set may be deprotected and oxidized. In this 
way a defined set of disulfide cross-links may be formed. Alternatively, a pair of 
cysteines and a pair of chelating amino acid analogs may be incorporated so that the 
cross-links are of a different chemical nature. 

Non-classical amino acids that induce conformational constraints. The following non- 
classical amino acids may be incorporated in the peptide in order to introduce 
particular conformational motifs: 1,2,3,4-tetrahydroisoquinoline-3-carboxylate
[Kazmierski et al., J. Am. Chem. Soc., 113:2275-2283 (1991)]; (2S,3S)-methyl-
phenylalanine, (2S,3R)-methyl-phenylalanine, (2R,3S)-methyl-phenylalanine and
(2R,3R)-methyl-phenylalanine [Kazmierski and Hruby, Tetrahedron Lett. (1991)]; 2-
aminotetrahydronaphthalene-2-carboxylic acid [Landis, Ph.D. Thesis, University of
Arizona (1989)]; hydroxy-1,2,3,4-tetrahydroisoquinoline-3-carboxylate [Miyake et al.,
J. Takeda Res. Labs., 43:53-76 (1989)]; β-carboline (D and L) [Kazmierski, Ph.D.
Thesis, University of Arizona (1988)]; HIC (histidine isoquinoline carboxylic acid)
[Zechel et al., Int. J. Pep. Protein Res., 43 (1991)]; and HIC (histidine cyclic urea)
(Dharanipragada).

The following amino acid analogs and peptidomimetics may be incorporated into a 
peptide to induce or favor specific secondary structures: LL-Acp (LL-3-amino-
2-propenidone-6-carboxylic acid), a β-turn inducing dipeptide analog [Kemp et al., J.
Org. Chem., 50:5834-5838 (1985)]; β-sheet inducing analogs [Kemp et al.,
Tetrahedron Lett., 29:5081-5082 (1988)]; β-turn inducing analogs [Kemp et al.,
Tetrahedron Lett., 29:5057-5060 (1988)]; α-helix inducing analogs [Kemp et al.,
Tetrahedron Lett., 29:4935-4938 (1988)]; γ-turn inducing analogs [Kemp et al., J.
Org. Chem., 54:109-115 (1989)]; and analogs provided by the following references:
Nagai and Sato, Tetrahedron Lett., 26:647-650 (1985); DiMaio et al., J. Chem. Soc.
Perkin Trans., p. 1687 (1989); also a Gly-Ala turn analog [Kahn et al., Tetrahedron
Lett., 30:2317 (1989)]; amide bond isostere [Jones et al., Tetrahedron Lett., 29:3853-
3856 (1988)]; tetrazole [Zabrocki et al., J. Am. Chem. Soc., 110:5875-5880 (1988)];
DTC [Samanen et al., Int. J. Protein Pep. Res., 35:501-509 (1990)]; and analogs taught
in Olson et al., J. Am. Chem. Soc., 112:323-333 (1990) and Garvey et al., J. Org.
Chem., 56:436 (1990). Conformationally restricted mimetics of beta turns and beta
bulges, and peptides containing them, are described in U.S. Patent No. 5,440,013,
issued August 8, 1995 to Kahn.

Structure-based Mutation Analysis 

Protein structural analysis using NMR spectroscopy has several unique advantages. In 
addition to high-resolution three-dimensional structural information, the chemical shift 
assignments for the protein obtained in the structural study further provides a map of 
the entire protein at the atomic level, which can be used for structure-based 
biochemical analysis of protein-protein interactions. For example, the information 
generated from the NMR structural analysis can also serve to identify specific amino 
acid residues in the peptide-binding site for complementary mutagenesis studies. 
Specific focus can be placed on those residues that display long-range NOEs 
(particularly the side-chain NOEs in the ¹³C-NOESY data) between the bromodomain
and a peptide comprising an acetyl-lysine. 

To ensure that mutant proteins are valid for functional analysis, it can be determined
whether a mutation results in any significant perturbation of the overall conformation
of the bromodomain, particularly the effects of mutation on the acetyl-lysine binding
sites. NMR spectroscopy is a powerful method for examining the effects of such a
mutation on the conformation of the protein. One can readily obtain information about
the global conformation of a mutant protein from the proton (¹H) 1D spectrum, by
examining the chemical shift dispersion and peak line-width of NMR signals of amide,
aromatic and aliphatic protons. Moreover, 2D HSQC spectra reveal details of
the effects of a mutation on both local and global conformation of the protein, since
every single ¹H/¹⁵N signal (both the chemical shift and line-shape) in the NMR
spectrum is a "reporter" for a particular amino acid residue. Thus, to assess how
mutations affect protein stability and the overall protein conformation, the ¹⁵N HSQC
spectra of mutated proteins can be compared to that of the wild-type
bromodomain.
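
By way of illustration only, the comparison of mutant and wild-type HSQC spectra
described above can be reduced to one weighted shift difference per residue. The
following is a minimal sketch under stated assumptions: the function name, the scaling
factor of 0.2 applied to the ¹⁵N dimension, and the peak positions are hypothetical and
are not taken from the data of the present invention.

```python
import math

def combined_shift_change(wild_type, mutant, n_weight=0.2):
    """Return {residue: weighted 1H/15N shift difference in ppm}.

    wild_type, mutant: dicts mapping residue number -> (1H ppm, 15N ppm).
    n_weight: assumed scaling of the 15N dimension (hypothetical value).
    Residues that are missing from the mutant spectrum are skipped.
    """
    changes = {}
    for residue, (h_wt, n_wt) in wild_type.items():
        if residue not in mutant:
            continue
        h_mut, n_mut = mutant[residue]
        d_h = h_mut - h_wt
        d_n = (n_mut - n_wt) * n_weight
        changes[residue] = math.sqrt(d_h ** 2 + d_n ** 2)
    return changes

# Hypothetical peak positions for three residues.
wt = {10: (8.20, 120.0), 11: (7.90, 118.5), 12: (8.55, 125.0)}
mut = {10: (8.20, 120.0), 11: (8.20, 116.5), 12: (8.56, 125.0)}
csp = combined_shift_change(wt, mut)
```

Residues with near-zero values can be taken as structurally unperturbed, whereas
clusters of large values far from the mutation site would suggest a change in the
global fold.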

Chemical-shift perturbations due to ligand binding have proven to be a reliable and 
sensitive probe for the ligand binding site of the protein. This is because the chemical- 
shift changes of the backbone amide groups are likely to reflect any changes in protein 
conformation and/or hydrogen bonding due to the peptide/ligand binding. To examine 
the effects of a mutation on the ligand binding (in this case the ligand is a peptide 
comprising an acetyl-lysine), peptide titration experiments can be conducted by 
following the changes of ¹H/¹⁵N signals of the mutant proteins as a function of the
peptide concentration. These experiments indicate whether the acetyl-lysine binding
site remains the same or changes in the mutants relative to the wild type protein. The
effects of the mutation on the peptide binding affinity can also be examined by NMR
spectroscopy. If a mutation reduces the binding affinity, a
change of the exchange phenomenon between the free and the ligand-bound signals
should be observed in the NMR spectrum. If the reduction in binding affinity causes the
peptide binding to change from a slow exchange rate to a fast exchange rate, on the 
NMR time scale, then the peptide binding affinity can be determined from the NMR 
titration experiment. From these mutation analyses key amino acid residues that are 
important for binding a peptide comprising the acetyl-lysine can be identified. Such 
analysis has been exemplified below. 
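
As a non-limiting sketch of how a binding affinity might be extracted from such a
titration in the fast exchange regime, the observed shift change can be fitted to a
1:1 binding isotherm. The single-site model, the grid search, the function names and
the synthetic concentrations below are assumptions made for illustration; they are
not the procedure used in the Example.

```python
import numpy as np

def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (all values in the same units)."""
    s = p_total + l_total + kd
    return (s - np.sqrt(s ** 2 - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, l_total, delta_obs, kd_grid):
    """Grid search over Kd; for each trial Kd the maximal shift change
    is obtained in closed form by linear least squares.
    Returns (Kd, maximal shift change, sum of squared residuals)."""
    best = None
    for kd in kd_grid:
        fb = fraction_bound(p_total, l_total, kd)
        delta_max = float(np.dot(fb, delta_obs) / np.dot(fb, fb))
        sse = float(np.sum((delta_obs - delta_max * fb) ** 2))
        if best is None or sse < best[2]:
            best = (float(kd), delta_max, sse)
    return best

# Synthetic, noise-free titration (hypothetical numbers, mM and ppm).
protein = 0.5
ligand = np.array([0.0, 0.125, 0.25, 0.5, 1.0, 2.0, 4.0])
observed = 0.20 * fraction_bound(protein, ligand, 0.30)
kd_fit, delta_max_fit, _ = fit_kd(protein, ligand, observed,
                                  np.linspace(0.01, 2.0, 200))
```

Comparing the fitted dissociation constant of a mutant with that of the wild-type
protein gives a direct measure of the contribution of the mutated residue to peptide
binding.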

Protein Structure Determination by NMR Spectroscopy 

The NMR results from the present invention are summarized by the atomic structure 
coordinates of the free form of the P/CAF bromodomain (Table 5) and of the P/CAF
bromodomain-acetyl-histamine complex (Table 6). The NMR chemical shift
assignments of the P/CAF bromodomain are included in the chemical shift table (Table
1) for the ¹H-¹⁵N HSQC spectrum of P/CAF bromodomain. The unambiguous NOE-
derived Inter-proton Distance Restraints are in Table 2, the ambiguous NOE-derived
Inter-proton Distance Restraints are in Table 3, and the H-bonding restraints are
disclosed in Table 4.

Backbone and Side-chain Assignments: Sequence-specific backbone assignment can
be achieved by using a suite of deuterium-decoupled triple-resonance 3D NMR
experiments which include HNCA, HN(CO)CA, HN(CA)CB, HN(COCA)CB, HNCO,
and HN(CA)CO experiments [Yamazaki et al., J. Am. Chem. Soc. 116:11655-11666
(1994)]. The water flip-back scheme is used in these NMR pulse programs to
minimize amide signal attenuation from water exchange. Sequential side-chain
assignments are typically accomplished from a series of 3D NMR experiments with
alternative approaches to confirm the assignments. These experiments include 3D ¹⁵N
TOCSY-HSQC, HCCH-TOCSY, (H)C(CO)NH-TOCSY, and H(C)(CO)NH-TOCSY
[see Clore, G. M. & Gronenborn, A. M. Meth. Enzymol. 239:249-363 (1994); Sattler et
al., Prog. in Nuclear Magnetic Resonance Spec. 4:93-158 (1999)].

Stereospecific Methyl Groups: Stereospecific assignments of methyl groups of valine
and leucine residues can be obtained from an analysis of carbon signal multiplet
splitting using a fractionally ¹³C-labeled protein sample, which can be readily prepared
using M9 minimal medium containing a 10% ¹³C-/90% ¹²C-glucose mixture [see Neri et
al., Biochemistry 28:7510-7516 (1989)].

Dihedral Angle Restraints: Backbone dihedral angle (φ) constraints can be generated
from the coupling constants measured in an HNHA-J experiment [see Vuister, G.
& Bax, A. J. Am. Chem. Soc. 115:7772-7777 (1993)]. Side-chain dihedral angles
can be obtained from short mixing time ¹⁵N-edited 3D TOCSY-HSQC [see Clore et
al., J. Biomol. NMR 1:13-22 (1991)] and 3D HNHB experiments [see Matson et al., J.
Biomol. NMR 3:239-244 (1993)], which can also provide stereospecific assignments of
β methylene protons.
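
The relationship between a measured three-bond HN-Hα coupling constant and the φ
angle is commonly expressed by a Karplus-type equation. The sketch below is offered
as an illustration only: the coefficients (6.51, -1.76 and 1.60 Hz) are a commonly
quoted parameterization assumed here, and the coupling cutoffs and restraint ranges
are hypothetical values, not those used to generate the restraints of the present
invention.

```python
import math

# Assumed Karplus coefficients for 3J(HN,HA) in Hz (illustrative values).
A, B, C = 6.51, -1.76, 1.60

def karplus_j(phi_deg):
    """Predicted 3J(HN,HA) in Hz for a backbone phi angle given in degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

def phi_restraint(j_hz, small_j=5.5, large_j=8.0):
    """Map a measured coupling to a (center, +/- range) phi restraint in degrees.

    The cutoffs and ranges are hypothetical. Intermediate couplings are
    ambiguous and return None, i.e. no restraint is generated.
    """
    if j_hz < small_j:
        return (-60.0, 30.0)    # consistent with helical conformations
    if j_hz > large_j:
        return (-120.0, 40.0)   # consistent with extended conformations
    return None

restraints = [phi_restraint(j) for j in (4.0, 7.0, 9.0)]
```

Leaving intermediate couplings unrestrained avoids biasing the structure calculation
where the Karplus curve admits more than one φ value.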

Hydrogen Bond Restraints: Amide protons that are involved in hydrogen bonds can
be identified from an analysis of amide exchange rates measured from a series of 2D
¹H/¹⁵N HSQC spectra recorded after adding ²H₂O to the protein sample.

NOE Distance Restraints: Distance restraints are obtained from analysis of ¹⁵N- and
¹³C-edited 3D NOESY data, which can be collected with different mixing times to
minimize spin diffusion problems. The nuclear Overhauser effect (NOE)-derived
restraints are categorized as strong (1.8-3 Å), medium (1.8-4 Å) or weak (1.8-5 Å)
based on the observed NOE intensities. A recently developed procedure for
iterative automated NOE analysis using ARIA [see Nilges et al., Prog. NMR
Spectroscopy 32:107-139 (1998)] can be employed, which integrates with X-PLOR for
structural calculations. To ensure the success of ARIA/X-PLOR-assisted NOE analysis
and structure calculations, the ARIA assigned NOE peaks can be manually confirmed.
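
The categorization of NOE intensities into distance ranges can be sketched as
follows. The distance bounds are those stated above; the normalized intensity
cutoffs and the function name are hypothetical and serve only to illustrate the
binning step, which in practice is calibrated against known distances.

```python
def noe_distance_bounds(intensity, strong_cutoff=0.5, medium_cutoff=0.2):
    """Return (lower, upper) bounds in angstroms for a normalized NOE intensity.

    intensity: cross-peak intensity scaled to the range (0, 1].
    strong_cutoff, medium_cutoff: assumed cutoffs (hypothetical values).
    """
    if intensity <= 0.0:
        raise ValueError("NOE intensity must be positive")
    if intensity >= strong_cutoff:
        return (1.8, 3.0)   # strong
    if intensity >= medium_cutoff:
        return (1.8, 4.0)   # medium
    return (1.8, 5.0)       # weak

# Hypothetical cross-peaks: (proton A, proton B, normalized intensity).
peaks = [("HA_12", "HN_13", 0.80), ("HB_12", "HN_40", 0.30), ("HG_7", "HD_55", 0.05)]
restraint_table = [(a, b) + noe_distance_bounds(i) for a, b, i in peaks]
```

Each entry of the resulting table pairs two protons with a lower and an upper
distance bound, which is the form in which such restraints are typically supplied to
a structure calculation program.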

Intermolecular NOE Distance Restraints: For the structural determination of a
protein/peptide complex, intermolecular NOE distance restraints can be obtained from
a ¹³C-edited (F1) and ¹⁵N- and ¹³C-filtered (F3) 3D NOESY data set collected for a
sample containing isotope-labeled protein and non-labeled peptide.

Structure Calculations and Refinements: Structures of the protein can be generated 
using a distance geometry/simulated annealing protocol with the X-PLOR program
[see Nilges et al., FEBS Lett. 229:317-324 (1988); Kuszewski et al., J. Biomol. NMR
2:33-56 (1992); Brünger, A. T. X-PLOR Version 3.1: A system for X-Ray
crystallography and NMR (Yale University Press, New Haven, CT, 1993)]. The
structure calculations can employ inter-proton distance restraints obtained from ¹⁵N-
and ¹³C-resolved NOESY spectra. The initial low-resolution structures can be used to
facilitate NOE assignments, and help identify hydrogen bonding partners for slowly
exchanging amide protons. The experimental restraints of dihedral angles and 
hydrogen bonds can be included in the distance restraints for structure refinements. 

Protein-Structure Based Design of Agonists and Antagonists
of the Bromodomain-Acetyl-Lysine Binding Complex

Once the three-dimensional structures of the bromodomain and the bromodomain-
acetyl-lysine binding complex are determined, a potential drug or agent (antagonist or
agonist) can be examined through the use of computer modeling using a docking
program such as GRAM, DOCK, or AUTODOCK [Dunbrack et al., 1997, supra].
This procedure can include computer fitting of potential agents to the bromodomain,
for example, to ascertain how well the shape and the chemical structure of the potential
ligand will complement or interfere with the interaction between the bromodomain and
the acetyl-lysine [Bugg et al., Scientific American, Dec.:92-98 (1993); West et al.,
TIPS, 16:67-74 (1995)]. Computer programs can also be employed to estimate the
attraction, repulsion, and steric hindrance of the agent to the dimer-dimer binding site,
for example. Generally the tighter the fit (e.g., the lower the steric hindrance, and/or
the greater the attractive force) the more potent the potential drug will be since these
properties are consistent with a tighter binding constant. Furthermore, the more
specificity in the design of a potential drug the more likely that the drug will not
interfere with related proteins. This will minimize potential side-effects due to
unwanted interactions with other proteins.
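
The estimation of attraction, repulsion and steric hindrance referred to above can be
illustrated with a simple pairwise energy function. This sketch does not reproduce
the scoring function of GRAM, DOCK or AUTODOCK; it assumes a generic Lennard-Jones
term plus a Coulomb term, and the well depth, contact distance, dielectric constant
and coordinates are hypothetical.

```python
import numpy as np

def interaction_energy(ligand_xyz, ligand_q, site_xyz, site_q,
                       epsilon=0.2, r_min=3.5, dielectric=4.0):
    """Approximate ligand/site interaction energy in kcal/mol.

    Coordinates are in angstroms and charges in units of the electron charge.
    epsilon, r_min and dielectric are assumed, uniform parameters.
    """
    lig = np.asarray(ligand_xyz, dtype=float)
    site = np.asarray(site_xyz, dtype=float)
    # Matrix of all ligand-atom to site-atom distances.
    r = np.linalg.norm(lig[:, None, :] - site[None, :, :], axis=2)
    ratio6 = (r_min / r) ** 6
    # 12-6 term: strongly positive for clashes, -epsilon at the contact distance.
    lennard_jones = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    # Coulomb term; 332.06 converts e^2/angstrom to kcal/mol.
    charges = np.outer(np.asarray(ligand_q, dtype=float),
                       np.asarray(site_q, dtype=float))
    coulomb = 332.06 * charges / (dielectric * r)
    return float(lennard_jones.sum() + coulomb.sum())

origin = [[0.0, 0.0, 0.0]]
e_contact = interaction_energy(origin, [0.0], [[3.5, 0.0, 0.0]], [0.0])
e_clash = interaction_energy(origin, [0.0], [[2.0, 0.0, 0.0]], [0.0])
e_salt_bridge = interaction_energy(origin, [1.0], [[3.5, 0.0, 0.0]], [-1.0])
```

In this simplified picture a steric clash raises the energy, optimal contact lowers
it, and complementary charges lower it further, which corresponds to the "tighter
fit" discussed above.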

Initially a potential drug could be obtained by screening a random peptide library 
produced by recombinant bacteriophage, for example [Scott and Smith, Science,
249:386-390 (1990); Cwirla et al., Proc. Natl. Acad. Sci., 87:6378-6382 (1990);
Devlin et al., Science, 249:404-406 (1990)], or a chemical library. An agent selected in
this manner could then be systematically modified by computer modeling programs
until one or more promising potential drugs are identified. Such analysis has been
shown to be effective in the development of HIV protease inhibitors [Lam et al.,
Science 263:380-384 (1994); Wlodawer et al., Ann. Rev. Biochem. 62:543-585 (1993);
Appelt, Perspectives in Drug Discovery and Design 1:23-48 (1993); Erickson,
Perspectives in Drug Discovery and Design 1:109-128 (1993)].

Such computer modeling allows the selection of a finite number of rational chemical 
modifications, as opposed to the countless number of essentially random chemical 
modifications that could be made, any one of which might lead to a useful drug. Each
chemical modification requires additional chemical steps, which, while reasonable
for the synthesis of a finite number of compounds, quickly become
overwhelming if all possible modifications need to be synthesized. Thus, through
the use of the three-dimensional structural analysis disclosed herein and computer
modeling, a large number of these compounds can be rapidly screened on the computer
monitor screen, and a few likely candidates can be determined without the laborious
synthesis of untold numbers of compounds.

Once a potential drug (agonist or antagonist) is identified it can be either selected from
a library of chemicals as are commercially available from most large chemical
companies including Merck, Glaxo Wellcome, Bristol-Myers Squibb, Monsanto/Searle,
Eli Lilly, Novartis and Pharmacia Upjohn, or alternatively the potential drug may be
synthesized de novo. As mentioned above, the de novo synthesis of one or even a
relatively small group of specific compounds is reasonable in the art of drug design.

The potential drug can then be tested in any standard binding assay (including in high 
throughput binding assays) for its ability to bind to the ZA loop of a bromodomain.
Alternatively the potential drug can be tested for its ability to modulate the binding of a
bromodomain to acetylated histamine, for example. When a suitable potential drug is
identified, a second NMR structural analysis can optionally be performed on the
binding complex formed between the bromodomain-acetyl-lysine binding complex, or
the bromodomain alone and the potential drug. Computer programs that can be used to
aid in solving such three-dimensional structures include QUANTA, CHARMM,
INSIGHT, SYBYL, MACROMODEL, ICM, MOLMOL, RASMOL, and GRASP
[Kraulis, J. Appl. Crystallogr. 24:946-950 (1991)]. Most if not all of these programs
and others as well can also be obtained from the World Wide Web through the internet.

Using the approach described herein and equipped with the structural analysis
disclosed herein, the three-dimensional structures of other bromodomain-acetyl-lysine
binding complexes can more readily be obtained and analyzed. Such analysis will, in 
turn, allow corresponding drug screening methodology to be performed using the 
three-dimensional structures of such related complexes. 

For all of the drug screening assays described herein further refinements to the 
structure of the drug will generally be necessary and can be made by the successive 
iterations of any and/or all of the steps provided by the particular drug screening assay, 
including further structural analysis by NMR, for example. 

Phage libraries for Drug Screening. 

Phage libraries have been constructed which when infected into host E. coli produce 
random peptide sequences of approximately 10 to 15 amino acids [Parmley and Smith,
Gene 73:305-318 (1988); Scott and Smith, Science 249:386-390 (1990)]. Specifically,
the phage library can be mixed in low dilutions with permissive E. coli in low melting
point LB agar which is then poured on top of LB agar plates. After incubating the
plates at 37 °C for a period of time, small clear plaques will form in a lawn of E. coli,
which represent active phage growth and lysis of the E. coli. A representative of these
phages can be adsorbed to nylon filters by placing dry filters onto the agar plates. The
filters can be marked for orientation, removed, and placed in washing solutions to
block any remaining adsorbent sites. The filters can then be placed in a solution
containing, for example, a radioactive bromodomain. After a specified incubation 
period, the filters can be thoroughly washed and developed for autoradiography. 
Plaques containing the phage that bind to the radioactive bromodomain can then be 
identified. These phages can be further cloned and then retested for their ability to 
bind to the bromodomain as before. Once the phage has been purified, the binding 
sequence contained within the phage can be determined by standard DNA sequencing 
techniques. Once the DNA sequence is known, synthetic peptides can be generated 
which are encoded by these sequences. These peptides can be tested, for example, for 
their ability to modulate the affinity of the bromodomain for its binding partner (e.g., a 
protein comprising an acetyl-lysine or a fragment of that protein). 

The effective peptide(s) can be synthesized in large quantities for use in in vivo models 
and eventually in humans to treat certain tumors. It should be emphasized that 
synthetic peptide production is relatively non-labor intensive, easily manufactured, 
quality controlled and thus, large quantities of the desired product can be produced 
quite cheaply. Similar combinations of mass produced synthetic peptides have been 
used with great success [Patarroyo, Vaccine, 10:175-178 (1990)]. 

Drug Screening Assays 

The drug screening assays of the present invention may use any of a number of means 
for determining the interaction between an agent or drug and a peptide comprising an 
acetyl-lysine and/or a bromodomain. Thus, standard high throughput drug screening 
procedures can be employed using a library of low molecular weight compounds, for
example, that can be screened to identify a binding partner for the bromodomain. Any
such chemical library can be used including those discussed above. 

In a particular assay, a bromodomain is placed on or coated onto a solid support. 
Methods for placing the peptides or proteins on the solid support are well known in the 
art and include such things as linking biotin to the protein and linking avidin to the 
solid support. An agent is allowed to equilibrate with the bromodomain to test for 
binding. Generally, the solid support is washed and agents that are retained are 
selected as potential drugs. Alternatively, a peptide comprising an acetyl-lysine is 
placed on or coated onto a solid support. In a particular embodiment of this type, the 
peptide comprises the amino acid sequence of SEQ ID NO:4. 

The agent may be labeled. For example, in one embodiment radiolabeled agents are 
used to measure the binding of the agent. In another embodiment the agents have 
fluorescent markers. In yet another embodiment, a BIAcore chip (Pharmacia) coated
with the bromodomain is used, for example, and the change in surface conductivity can
be measured. 

In addition, since a number of proteins have been identified that contain 
bromodomains, and the binding partners of many of these proteins are known, the fact 
that the bromodomain specifically binds to an acetylated lysine as disclosed herein 
allows the identification and preparation of a number of potential modulators of the 
bromodomain-acetyl-lysine binding complex based on the amino acid sequences of the 
binding partners to the proteins. Such potential modulators include: ISYGR-AcK-
KRRQRR (SEQ ID NO:4), ARKSTGG-AcK-APRKQL (SEQ ID NO:5) and
QSTSRHK-AcK-LMFKTE (SEQ ID NO:6) which bind to the P/CAF bromodomain as
shown in the Example, below. Such peptides also can be used, for example, as a 
starting point for the design of an inhibitor of the bromodomain-acetyl-lysine binding 
complex. 

Alternatively, a drug can be specifically designed to bind to the ZA loop of a 
bromodomain, for example, the P/CAF bromodomain, and be assayed through
NMR based methodology [Shuker et al., Science 274:1531-1534 (1996), hereby
incorporated by reference in its entirety]. In a particular embodiment, analogs of the
binding partner of the bromodomain can be used in this analysis. One such peptide has
the amino acid sequence of SEQ ID NO:4. In another embodiment of this type, the
peptide has the amino acid sequence of SEQ ID NO:5. In another such embodiment of
this type, the peptide has the amino acid sequence of SEQ ID NO:6.

The assay begins with contacting a compound with a ¹⁵N-labeled bromodomain.
Binding of the compound with the ZA loop of the bromodomain can be determined by
monitoring the ¹⁵N- or ¹H-amide chemical shift changes in two dimensional ¹⁵N-
heteronuclear single-quantum correlation (¹⁵N-HSQC) spectra upon the addition of the
compound to the ¹⁵N-labeled bromodomain. Since these spectra can be rapidly
obtained, it is feasible to screen a large number of compounds [Shuker et al., Science
274:1531-1534 (1996)]. A compound is identified as a potential ligand if it binds to
the ZA loop of the bromodomain. In a further embodiment, the potential ligand can
then be used as a model structure, and analogs to the compound can be obtained (e.g.,
from the vast chemical libraries commercially available, or alternatively through de
novo synthesis). The analogs are then screened for their ability to bind the ZA loop of
the bromodomain thus to obtain a ligand. An analog of the potential ligand is chosen
as a ligand when it binds to the ZA loop of the bromodomain with a higher binding
affinity than the potential ligand. In a preferred embodiment of this type the analogs
are screened by monitoring the ¹⁵N- or ¹H-amide chemical shift changes in two
dimensional ¹⁵N-heteronuclear single-quantum correlation (¹⁵N-HSQC) spectra upon
the addition of the analog to the ¹⁵N-labeled bromodomain as described above.
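
One possible way to automate the selection of potential ligands from such a screen is
sketched below. It assumes that per-residue amide shift changes have already been
computed for each compound; the residue numbers assigned to the ZA loop, the 0.05 ppm
threshold, the requirement of at least two perturbed residues, and the compound names
are all hypothetical.

```python
def select_hits(shift_changes, za_loop_residues, threshold=0.05, min_residues=2):
    """Return compound names that perturb the ZA loop, strongest first.

    shift_changes: {compound: {residue number: amide shift change in ppm}}.
    za_loop_residues: residue numbers taken to form the ZA loop (assumed input).
    A compound is kept when at least min_residues ZA loop residues shift by
    threshold or more; kept compounds are ranked by the mean of those shifts.
    """
    hits = []
    for compound, per_residue in shift_changes.items():
        perturbed = [per_residue[r] for r in za_loop_residues
                     if per_residue.get(r, 0.0) >= threshold]
        if len(perturbed) >= min_residues:
            hits.append((compound, sum(perturbed) / len(perturbed)))
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]

za_loop = [5, 6, 7, 8]  # hypothetical residue numbers
screen = {
    "cmpd_A": {5: 0.12, 6: 0.08, 7: 0.01, 20: 0.00},
    "cmpd_B": {5: 0.01, 6: 0.02, 20: 0.15, 21: 0.11},
    "cmpd_C": {6: 0.06, 7: 0.07, 8: 0.02},
}
hits = select_hits(screen, za_loop)
```

The size of a shift change is only a rough indicator of binding strength, so ranked
hits would still be followed up by a titration to compare binding affinities.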

In another further embodiment, compounds are screened for binding to two nearby
sites on the bromodomain. In this case, a compound that binds a first site of the
bromodomain does not bind a second nearby site. Binding to the second site can be
determined by monitoring changes in a different set of amide chemical shifts in either
the original screen or a second screen conducted in the presence of a ligand (or
potential ligand) for the first site. From an analysis of the chemical shift changes the
approximate location of a potential ligand for the second site is identified.
Optimization of the second ligand for binding to the site is then carried out by
screening structurally related compounds (e.g., analogs as described above). When
ligands for the first site and the second site are identified, their location and orientation
in the ternary complex can be determined experimentally either by NMR spectroscopy
or X-ray crystallography. On the basis of this structural information, a linked
compound is synthesized in which the ligand for the first site and the ligand for the
second site are linked. In a preferred embodiment of this type the two ligands are
covalently linked. This linked compound is tested to determine if it has a higher
binding affinity for the bromodomain than either of the two individual ligands. A
linked compound is selected as a ligand when it has a higher binding affinity for the
bromodomain than either of the two ligands. In a preferred embodiment the affinity of
the linked compound with the bromodomain is determined by monitoring the ¹⁵N- or ¹H-
amide chemical shift changes in two dimensional ¹⁵N-heteronuclear single-quantum
correlation (¹⁵N-HSQC) spectra upon the addition of the linked compound to the ¹⁵N-
labeled bromodomain as described above.

A larger linked compound can be constructed in an analogous manner, e.g., linking 
three ligands which bind to three nearby sites on the bromodomain to form a 
multilinked compound that has an even higher affinity for the bromodomain than the 
linked compound. 

Identification of New Bromodomains 

By disclosing that protein bound acetyl-lysine is a binding partner for bromodomains, 
the present invention provides a method of identifying novel proteins that contain 
bromodomains. In short, a protein fragment or analog thereof comprising an acetyl- 
lysine can be used as bait to identify a binding partner that comprises a bromodomain. 
Any one of a number of procedures can be carried out to identify such a binding 
partner. One such assay comprises passing a cell extract over the bait peptide which is 
attached to a solid support. After washing the solid support to remove any non-
specific binders, the bromodomain containing protein can be eluted from the solid
support with an appropriate eluant. In a particular embodiment, the free bait peptide
can be used in the elution. Other methodology includes the use of a yeast two-hybrid
system, a GST pull down assay, ELISA, immunometric assays, and a modification of
the CORT procedure of Schlessinger et al. (U.S. Patent No. 5,858,686, issued on
January 12, 1999, which is hereby incorporated by reference in its entirety) for use with
the bromodomain-acetyl-lysine binding complex. 



Labels

Suitable labels include enzymes, fluorophores (e.g., fluorescein isothiocyanate (FITC),
phycoerythrin (PE), Texas red (TR), rhodamine, free or chelated lanthanide series salts,
especially Eu³⁺, to name a few fluorophores), chromophores, radioisotopes, chelating
agents, dyes, colloidal gold, latex particles, ligands (e.g., biotin), and
chemiluminescent agents. When a control marker is employed, the same or different
labels may be used for the test and control marker gene.

In the instance where a radioactive label, such as the isotopes ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl,
⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re, is used, known currently available
counting procedures may be utilized. In the instance where the label is an enzyme,
detection may be accomplished by any of the presently utilized colorimetric,
spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques
known in the art.

Direct labels are one example of labels which can be used according to the present
invention. A direct label has been defined as an entity, which in its natural state, is
readily visible, either to the naked eye, or with the aid of an optical filter and/or applied
stimulation, e.g., U.V. light to promote fluorescence. Examples of colored
labels which can be used according to the present invention include metallic sol
particles, for example, gold sol particles such as those described by Leuvering (U.S.
Patent 4,313,734); dye sol particles such as described by Gribnau et al. (U.S. Patent
4,373,932) and May et al. (WO 88/08534); dyed latex such as described by May, supra,
and Snyder (EP-A 0 280 559 and 0 281 327); or dyes encapsulated in liposomes as
described by Campbell et al. (U.S. Patent 4,703,017). Other direct labels include a
radionucleotide, a fluorescent moiety or a luminescent moiety. In addition to these
direct labeling devices, indirect labels comprising enzymes can also be used according
to the present invention. Various types of enzyme linked immunoassays are well
known in the art, for example, those using alkaline phosphatase, horseradish peroxidase,
lysozyme, glucose-6-phosphate dehydrogenase, lactate dehydrogenase, or urease; these
and others have been discussed in detail by Eva Engvall in Enzyme Immunoassay
ELISA and EMIT in Methods in Enzymology, 70:419-439 (1980) and in U.S. Patent
4,857,453.

Suitable enzymes include, but are not limited to, alkaline phosphatase, β-galactosidase,
green fluorescent protein and its derivatives, luciferase, and horseradish peroxidase.

Other labels for use in the invention include magnetic beads or magnetic resonance
imaging labels.

Antibodies to Portions of the Bromodomain that Interact with Acetyl-Lysine

According to the present invention, the bromodomains, and more particularly the ZA
loops of the bromodomains and fragments thereof, can be produced by a recombinant
source, or through chemical synthesis, or through the modification of these peptides
and fragments; and derivatives or analogs thereof, including fusion proteins, may be
used as an immunogen to generate antibodies that specifically interfere with the
formation of the bromodomain-acetyl-lysine binding complex. Similarly, antibodies
can be raised against peptides that comprise one or more acetyl-lysine residues which
also interfere with the formation of the bromodomain-acetyl-lysine binding complex.
Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single
chain, Fab fragments, and a Fab expression library.

Various procedures known in the art may be used for the production of the polyclonal
antibodies. For the production of antibody, various host animals, including but not
limited to rabbits, mice, rats, sheep, goats, etc., can be immunized by injection with
the peptide having the amino acid sequence of SEQ ID NO:3, for example, or a
derivative (e.g., a fusion protein) thereof. In one embodiment, the peptide can be
conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole
limpet hemocyanin (KLH). Various adjuvants may be used to increase the
immunological response, depending on the host species, including but not limited to
Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface
active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human
adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed toward the peptides or protein
fragments of the present invention, or analog, or derivative thereof, any technique that
provides for the production of antibody molecules by continuous cell lines in culture
may be used. These include but are not limited to the hybridoma technique originally
developed by Kohler and Milstein [Nature, 256:495-497 (1975)], as well as the trioma
technique, the human B-cell hybridoma technique [Kozbor et al., Immunology Today,
4:72 (1983); Cote et al., Proc. Natl. Acad. Sci. U.S.A., 80:2026-2030 (1983)], and the
EBV-hybridoma technique to produce human monoclonal antibodies [Cole et al., in
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)]. In
an additional embodiment of the invention, monoclonal antibodies can be produced in
germ-free animals utilizing technology described in PCT/US90/02545. In fact,
according to the invention, techniques developed for the production of "chimeric
antibodies" [Morrison et al., J. Bacteriol., 159:870 (1984); Neuberger et al., Nature,
312:604-608 (1984); Takeda et al., Nature, 314:452-454 (1985)] by splicing the genes
from a mouse antibody molecule specific for the peptide having the amino acid
sequence of SEQ ID NO:3, for example, together with genes from a human antibody
molecule of appropriate biological activity can be used; such antibodies are within the
scope of this invention. Such human or humanized chimeric antibodies are preferred
for use in therapy of human diseases or disorders (described infra), since the human or
humanized antibodies are much less likely than xenogenic antibodies to induce an
immune response, in particular an allergic response, themselves.

According to the invention, techniques described for the production of single chain
antibodies [U.S. Patent Nos. 5,476,786 and 5,132,405 to Huston; U.S. Patent
4,946,778] can be adapted to produce specific single chain antibodies. An additional
embodiment of the invention utilizes the techniques described for the construction of
Fab expression libraries [Huse et al., Science, 246:1275-1281 (1989)] to allow rapid
and easy identification of monoclonal Fab fragments with the desired specificity.

Antibody fragments which contain the idiotype of the antibody molecule can be
generated by known techniques. For example, such fragments include but are not
limited to: the F(ab')2 fragment which can be produced by pepsin digestion of the
antibody molecule; the Fab' fragments which can be generated by reducing the
disulfide bridges of the F(ab')2 fragment; and the Fab fragments which can be
generated by treating the antibody molecule with papain and a reducing agent.

In the production of antibodies, screening for the desired antibody can be accomplished
by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked
immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel
diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using
colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation
reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination
assays), complement fixation assays, immunofluorescence assays, protein A assays,
and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is
detected by detecting a label on the primary antibody. In another embodiment, the
primary antibody is detected by detecting binding of a secondary antibody or reagent to
the primary antibody. In a further embodiment, the secondary antibody is labeled.
Many means are known in the art for detecting binding in an immunoassay and are
within the scope of the present invention. For example, to select antibodies which
recognize a specific epitope of a ZA loop of a bromodomain, one may
assay generated hybridomas for a product which binds to a bromodomain fragment
containing such an epitope and choose those which do not cross-react with
bromodomain fragments that do not include that epitope.

In a specific embodiment, antibodies that interfere with the formation of the 
bromodomain-acetyl-lysine complex can be generated. Such antibodies can be tested 
using the assays described and could potentially be used in anti-cancer therapies. 

Administration



According to the invention, the component or components of a therapeutic
composition, e.g., an agent of the invention that interferes with the
bromodomain-acetyl-lysine binding complex such as the peptide having the amino acid
sequence of SEQ ID NOs:4, 5, or 6 and a pharmaceutically acceptable carrier, may be
introduced parenterally, transmucosally, e.g., orally, nasally, or rectally, or transdermally.
Preferably, administration is parenteral, e.g., via intravenous injection, and also
includes, but is not limited to, intra-arteriole, intramuscular, intradermal,
subcutaneous, intraperitoneal, intraventricular, and intracranial administration.

In a preferred aspect, the agent of the present invention can cross cellular and nuclear
membranes, which would allow for intravenous or oral administration. Strategies are
available for such crossing, including but not limited to, increasing the hydrophobic
nature of a molecule; introducing the molecule as a conjugate to a carrier, such as a
ligand to a specific receptor, targeted to a receptor; and the like.

The present invention also provides for conjugating targeting molecules to such an
agent. "Targeting molecule" as used herein shall mean a molecule which, when
administered in vivo, localizes to desired location(s). In various embodiments, the
targeting molecule can be a peptide or protein, antibody, lectin, carbohydrate, or
steroid. In one embodiment, the targeting molecule is a peptide ligand of a receptor on
the target cell. In a specific embodiment, the targeting molecule is an antibody.
Preferably, the targeting molecule is a monoclonal antibody. In one embodiment, to
facilitate crosslinking, the antibody can be reduced to two heavy and light chain
heterodimers, or the F(ab')2 fragment can be reduced, and crosslinked to the agent via
the reduced sulfhydryl. Antibodies for use as targeting molecules are specific for a cell
surface antigen.

In another embodiment, the therapeutic compound can be delivered in a vesicle, in
particular a liposome [see Langer, Science, 249:1527-1533 (1990); Treat et al., in
Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and
Fidler (eds.), Liss: New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327;
see generally ibid.].

In yet another embodiment, the therapeutic compound can be delivered in a controlled
release system. For example, the agent may be administered using intravenous
infusion, an implantable osmotic pump, a transdermal patch, liposomes, or other
modes of administration. In one embodiment, a pump may be used [see Langer, supra;
Sefton, CRC Crit. Ref. Biomed. Eng., 14:201 (1987); Buchwald et al., Surgery, 88:507
(1980); Saudek et al., N. Engl. J. Med., 321:574 (1989)]. In another embodiment,
polymeric materials can be used [see Medical Applications of Controlled Release,
Langer and Wise (eds.), CRC Press: Boca Raton, Florida (1974); Controlled Drug
Bioavailability, Drug Product Design and Performance, Smolen and Ball (eds.),
Wiley: New York (1984); Langer and Peppas, J. Macromol. Sci. Rev. Macromol.
Chem., 23:61 (1983); see also Levy et al., Science, 228:190 (1985); During et al., Ann.
Neurol., 25:351 (1989); Howard et al., J. Neurosurg., 71:105 (1989)]. In yet another
embodiment, a controlled release system can be placed in proximity of the therapeutic
target, i.e., the bone marrow, thus requiring only a fraction of the systemic dose [see,
e.g., Goodson, in Medical Applications of Controlled Release, supra, vol. 2, pp. 115-138
(1984)]. Other controlled release systems are discussed in the review by Langer
[Science, 249:1527-1533 (1990)].

Pharmaceutical Compositions. In yet another aspect of the present invention, provided
are pharmaceutical compositions of the above. Such pharmaceutical compositions may
be for administration for injection, or for oral, pulmonary, nasal or other forms of
administration. In general, comprehended by the invention are pharmaceutical
compositions comprising effective amounts of a low molecular weight component or
components, or derivative products, of the invention together with pharmaceutically
acceptable diluents, preservatives, solubilizers, emulsifiers, adjuvants and/or carriers.
Such compositions include diluents of various buffer content (e.g., Tris-HCl, acetate,
phosphate), pH and ionic strength; additives such as detergents and solubilizing agents
(e.g., Tween 80, Polysorbate 80), anti-oxidants (e.g., ascorbic acid, sodium
metabisulfite), preservatives (e.g., Thimerosal, benzyl alcohol) and bulking substances
(e.g., lactose, mannitol); incorporation of the material into particulate preparations of
polymeric compounds such as polylactic acid, polyglycolic acid, etc. or into liposomes.
Hyaluronic acid may also be used. Such compositions may influence the physical
state, stability, rate of in vivo release, and rate of in vivo clearance of the present
proteins and derivatives. See, e.g., Remington's Pharmaceutical Sciences, 18th Ed.
[1990, Mack Publishing Co., Easton, PA 18042] pages 1435-1712 which are herein
incorporated by reference. The compositions may be prepared in liquid form, or may
be in dried powder, such as lyophilized form.

Oral Delivery. Contemplated for use herein are oral solid dosage forms, which are
described generally in Remington's Pharmaceutical Sciences, 18th Ed. 1990 (Mack
Publishing Co. Easton PA 18042) at Chapter 89, which is herein incorporated by
reference. Solid dosage forms include tablets, capsules, pills, troches or lozenges,
cachets or pellets. Also, liposomal or proteinoid encapsulation may be used to
formulate the present compositions (as, for example, proteinoid microspheres reported
in U.S. Patent No. 4,925,673). Liposomal encapsulation may be used and the
liposomes may be derivatized with various polymers (e.g., U.S. Patent No. 5,013,556).
A description of possible solid dosage forms for the therapeutic is given by Marshall,
K. In: Modern Pharmaceutics, edited by G.S. Banker and C.T. Rhodes, Chapter 10,
1979, herein incorporated by reference. In general, the formulation will include an
agent of the present invention (or chemically modified forms thereof) and inert
ingredients which allow for protection against the stomach environment, and release of
the biologically active material in the intestine.

Also specifically contemplated are oral dosage forms of the above derivatized
component or components. The component or components may be chemically
modified so that oral delivery of the derivative is efficacious. Generally, the chemical
modification contemplated is the attachment of at least one moiety to the component
molecule itself, where said moiety permits (a) inhibition of proteolysis; and (b) uptake
into the blood stream from the stomach or intestine. Also desired is the increase in
overall stability of the component or components and increase in circulation time in the
body. An example of such a moiety is polyethylene glycol.

For the component (or derivative) the location of release may be the stomach, the small
intestine (the duodenum, the jejunum, or the ileum), or the large intestine. One skilled
in the art has available formulations which will not dissolve in the stomach, yet will
release the material in the duodenum or elsewhere in the intestine. Preferably, the
release will avoid the deleterious effects of the stomach environment, either by
protection of the protein (or derivative) or by release of the biologically active material
beyond the stomach environment, such as in the intestine.

The therapeutic can be included in the formulation as fine multi-particulates in the
form of granules or pellets of particle size about 1 mm. The formulation of the
material for capsule administration could also be as a powder, lightly compressed
plugs or even as tablets. The therapeutic could be prepared by compression.

One may dilute or increase the volume of the therapeutic with an inert material. These
diluents could include carbohydrates, especially mannitol, α-lactose, anhydrous lactose,
cellulose, sucrose, modified dextrans and starch. Certain inorganic salts may also
be used as fillers, including calcium triphosphate, magnesium carbonate and sodium
chloride. Some commercially available diluents are Fast-Flo, Emdex, STA-Rx 1500,
Emcompress and Avicel.

Disintegrants may be included in the formulation of the therapeutic into a solid dosage
form. Materials used as disintegrants include but are not limited to starch, including
the commercial disintegrant based on starch, Explotab. Binders also may be used to
hold the therapeutic agent together to form a hard tablet and include materials from
natural products such as acacia, tragacanth, starch and gelatin.

An anti-frictional agent may be included in the formulation of the therapeutic to
prevent sticking during the formulation process. Lubricants may be used as a layer
between the therapeutic and the die wall. Glidants that might improve the flow
properties of the drug during formulation and aid rearrangement during compression
also might be added. The glidants may include starch, talc, pyrogenic silica and
hydrated silicoaluminate.

In addition, to aid dissolution of the therapeutic into the aqueous environment a
surfactant might be added as a wetting agent. Additives which potentially enhance
uptake of the protein (or derivative) are for instance the fatty acids oleic acid, linoleic
acid and linolenic acid.

Nasal Delivery. Nasal delivery of an agent of the present invention (or derivative) is
also contemplated. Nasal delivery allows the passage of a peptide, for example, to the
blood stream directly after administering the therapeutic product to the nose, without
the necessity for deposition of the product in the lung. Formulations for nasal delivery
include those with dextran or cyclodextran.

Transdermal administration. Various and numerous methods are known in the art for
transdermal administration of a drug, e.g., via a transdermal patch. Transdermal
patches are described in, for example, U.S. Patent No. 5,407,713, issued April 18, 1995
to Rolando et al.; U.S. Patent No. 5,352,456, issued October 4, 1994 to Fallon et al.;
U.S. Patent No. 5,332,213, issued August 9, 1994 to D'Angelo et al.; U.S. Patent No.
5,336,168, issued August 9, 1994 to Sibalis; U.S. Patent No. 5,290,561, issued March
1, 1994 to Farhadieh et al.; U.S. Patent No. 5,254,346, issued October 19, 1993 to
Tucker et al.; U.S. Patent No. 5,164,189, issued November 17, 1992 to Berger et al.;
U.S. Patent No. 5,163,899, issued November 17, 1992 to Sibalis; U.S. Patent Nos.
5,088,977 and 5,087,240, both issued February 18, 1992 to Sibalis; U.S. Patent No.
5,008,110, issued April 16, 1991 to Benecke et al.; and U.S. Patent No. 4,921,475,
issued May 1, 1990 to Sibalis, the disclosure of each of which is incorporated herein
by reference in its entirety.

It can be readily appreciated that a transdermal route of administration may be
enhanced by use of a dermal penetration enhancer, e.g., such as enhancers described in
U.S. Patent No. 5,164,189 (supra), U.S. Patent No. 5,008,110 (supra), and U.S. Patent
No. 4,879,119, issued November 7, 1989 to Aruga et al., the disclosure of each of
which is incorporated herein by reference in its entirety.

Pulmonary Delivery. Also contemplated herein is pulmonary delivery of the
pharmaceutical compositions of the present invention. A pharmaceutical composition
of the present invention is delivered to the lungs of a mammal while inhaling and
traverses across the lung epithelial lining to the blood stream. Other reports of this
include [Adjei et al., Pharmaceutical Research, 7:565-569 (1990); Adjei et al.,
International Journal of Pharmaceutics, 63:135-144 (1990) (leuprolide acetate);
Braquet et al., Journal of Cardiovascular Pharmacology, 13(suppl. 5):143-146 (1989)
(endothelin-1); Hubbard et al., Annals of Internal Medicine, Vol. III, pp. 206-212
(1989) (α1-antitrypsin); Smith et al., J. Clin. Invest., 84:1145-1146 (1989)
(α-1-proteinase); Oswein et al., "Aerosolization of Proteins", Proceedings of Symposium on
Respiratory Drug Delivery II, Keystone, Colorado, March, (1990) (recombinant human
growth hormone); Debs et al., J. Immunol., 140:3482-3488 (1988) (interferon-γ and
tumor necrosis factor alpha); Platz et al., U.S. Patent No. 5,284,656 (granulocyte
colony stimulating factor)]. A method and composition for pulmonary delivery of
drugs for systemic effect is described in U.S. Patent No. 5,451,569, issued September
19, 1995 to Wong et al.

A subject in whom administration of an agent of the present invention is an effective
therapeutic regimen for cancer, for example, is preferably a human, but can be any
animal. Thus, as can be readily appreciated by one of ordinary skill in the art, the
methods and pharmaceutical compositions of the present invention are particularly
suited to administration to any animal, e.g., for veterinary medical use, particularly for
a mammal, and including, but by no means limited to, domestic animals, such as feline
or canine subjects, farm animals, including bovine, equine, caprine, ovine, and porcine
subjects, wild animals (whether in the wild or in a zoological garden), research
animals, such as mice, rats, rabbits, goats, sheep, pigs, dogs, cats, and avian species, such
as chickens, turkeys, and songbirds.

The present invention may be better understood by reference to the following
non-limiting Example, which is provided as exemplary of the invention. The following
example is presented in order to more fully illustrate the preferred embodiments of the
invention. It should in no way be construed, however, as limiting the broad scope of
the invention.



EXAMPLE

STRUCTURE AND LIGAND OF A HISTONE
ACETYLTRANSFERASE BROMODOMAIN

Introduction

The bromodomain is a protein motif comprising approximately 110 amino acids that is
found in practically all nuclear histone acetyltransferases (HATs) [Jeanmougin et al.,
Trends in Biochemical Sciences, 22:151-153 (1997)]. However, despite the seemingly
requisite occurrence of this motif in HATs, its role in these enzymes is unknown.
Indeed, although this motif has also been identified in other chromatin proteins,
heretofore not even one binding partner for a bromodomain had been identified.

Materials and Methods 
Sample preparation: The bromodomain of P/CAF (residues 719-832 of SEQ ID NO:2)
was subcloned into the pET14b expression vector (Novagen) and expressed in
Escherichia coli BL21(DE3) cells. Uniformly ¹⁵N- and ¹⁵N/¹³C-labelled proteins were
prepared by growing bacteria in a minimal medium containing ¹⁵NH₄Cl with or
without ¹³C₆-glucose. A uniformly ¹⁵N/¹³C-labelled and fractionally deuterated protein
sample was prepared by growing the cells in 75% ²H₂O. The bromodomain was
purified by affinity chromatography on a nickel-IDA column (Invitrogen) followed by
the removal of the poly-His tag by thrombin cleavage. The final purification of the protein
was achieved by size-exclusion chromatography. The acetyl-lysine-containing
peptides were prepared on a MilliGen 9050 peptide synthesizer (Perkin Elmer) using
Fmoc/HBTU chemistry. Acetyl-lysine was incorporated using the reagent
Fmoc-Ac-Lys with HBTU/DIPEA activation. NMR samples contained approximately
1 mM protein in 100 mM phosphate buffer of pH 6.5 and 5 mM perdeuterated DTT and
0.5 mM EDTA in H₂O/²H₂O (9/1) or ²H₂O.
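
The residue range quoted above for the expressed construct (719-832 of SEQ ID NO:2) uses 1-based, inclusive numbering. The following is a minimal sketch, not part of the described method, of how such a range can be cut from a full-length sequence held as a string and how its length is counted. The function names and the toy sequence are hypothetical and are used for illustration only.

```python
def construct_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range start..end."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def subsequence(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a sequence string."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Residues 719-832 span 114 positions, consistent with a motif of
    # "approximately 110 amino acids".
    print(construct_length(719, 832))
    # Toy sequence (hypothetical), not SEQ ID NO:2.
    print(subsequence("ABCDEFG", 2, 4))
```

Applied to the full-length sequence of SEQ ID NO:2, `subsequence(seq, 719, 832)` would return the bromodomain construct, provided the sequence string is free of transcription errors.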

NMR spectroscopy: All NMR spectra were acquired at 30 °C on a Bruker DRX600 or
DRX500 spectrometer. The backbone assignments of the ¹H, ¹³C, and ¹⁵N resonances
were achieved using deuterium-decoupled triple-resonance experiments of HNCACB
and HN(CO)CACB [Yamazaki et al., J. Am. Chem. Soc. 116:11655-11666 (1994)]
recorded using the uniformly ¹⁵N/¹³C-labeled and fractionally deuterated protein. The
side-chain atoms were assigned from 3D HCCH-TOCSY [Clore and Gronenborn,
Meth. Enzymol. 239:249-363 (1994)] and (H)C(CO)NH-TOCSY [Logan et al., J.
Biomol. NMR 3:225-231 (1993)] data collected on the uniformly ¹⁵N/¹³C-labeled
protein. Stereospecific assignments of methyl groups of the Val and Leu residues were
obtained using a fractionally ¹³C-labeled sample [Neri et al., Biochemistry
28:7510-7516 (1989)]. The NOE-derived distance restraints were obtained from ¹⁵N- or
¹³C-edited 3D NOESY spectra. φ-angle restraints were determined based on the
³J(HN,Hα) coupling constants measured in a 3D HNHA spectrum [Clore and Gronenborn,
Meth. Enzymol. 239:249-363 (1994)]. Slowly exchanging amide protons were
identified from a series of 2D ¹⁵N-HSQC spectra recorded after the H₂O buffer was
changed to a ²H₂O buffer. The intermolecular NOEs used in defining the structure of
the bromodomain/Ac-histamine complex were detected in a ¹³C-edited (F1),
¹³C/¹⁵N-filtered (F3) 3D NOESY spectrum [Clore and Gronenborn, Meth. Enzymol.
239:249-363 (1994)]. All NMR spectra were processed with the NMRPipe/NMRDraw
programs and analyzed using NMRView [Johnson and Blevins, J. Biomol. NMR
4:603-614 (1994)].

Structure calculations: Structures of the bromodomain were calculated with a distance
geometry/simulated annealing protocol using the X-PLOR program [Brunger, A.
X-PLOR Version 3.1: A system for X-Ray crystallography and NMR, Yale University
Press, New Haven, CT, (1993)]. A total of 1324 manually assigned NOE-derived
distance restraints were obtained from the ¹⁵N- and ¹³C-edited NOE spectra. Further
analysis of the NOE spectra was carried out by the iterative automated assignment
procedure using ARIA [Nilges and O'Donoghue, Prog. NMR Spectroscopy 32:107-139
(1998)], which integrates with X-PLOR for structure calculations. A total of 1519
unambiguous and 590 ambiguous distance restraints were identified from the NOE
data by ARIA, many of which were checked and confirmed manually. The
ARIA-assigned distance restraints were in agreement with the structures calculated
using only the manually assigned NOE distance restraints, 28 hydrogen-bond distance
restraints for 14 hydrogen bonds, and 54 φ-angle restraints. The final structure
calculations employed a total of 3515 NMR experimental restraints obtained from the
manual and the ARIA-assisted assignments, 2843 of which were unambiguously
assigned NOE-derived distance restraints that comprise 1077 intra-residue, 621
sequential, 550 medium-range, and 595 long-range NOEs. For the ensemble of the
final 30 structures, no distance and torsional angle restraints were violated by more
than 0.3 Å and 5°, respectively. The total, distance violation, and dihedral violation
energies were 178.7 ± 2.4 kcal mol⁻¹, 41.6 ± 0.9 kcal mol⁻¹, and 0.50 ± 0.06 kcal mol⁻¹,
respectively. The Lennard-Jones potential, which was not used during any refinement
stage, was -526.2 ± 16.8 kcal mol⁻¹ for the final structures. Ramachandran plot analysis
of the final structures (residues 727-828) with Procheck-NMR [Laskowski et al., J.
Biomol. NMR 8:477-486 (1996)] showed that 71.0 ± 0.6%, 23.8 ± 0.6%, 3.5 ± 0.2%,
and 1.7 ± 0.2% of the non-Gly and non-Pro residues were in the most favorable,
additionally allowed, generously allowed, and disallowed regions, respectively. The
corresponding values for the residues in the four α-helices (residues 727-743, 770-776,
785-802, and 807-827) were 88.9 ± 0.4%, 11.0 ± 0.4%, 0.1 ± 0.1%, and 0.0 ± 0.0%,
respectively. The structure of the bromodomain/acetyl-histamine complex was
determined using the free form structure and an additional 25 intermolecular and 5
intra-ligand NOE-derived distance restraints.
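
The restraint counts reported above are internally consistent, which can be verified by simple arithmetic. The following sketch is provided only as a bookkeeping check and uses no data beyond the figures quoted in the preceding paragraph. The decomposition of the 3515 total into unambiguous NOE, ambiguous NOE, hydrogen-bond, and φ-angle restraints is an inference from those figures rather than an explicit statement of the text, and all variable and function names are hypothetical.

```python
import math

# Counts quoted in the structure-calculation paragraph.
MANUAL_NOE = 1324
ARIA_UNAMBIGUOUS_NOE = 1519
ARIA_AMBIGUOUS_NOE = 590
HBOND_RESTRAINTS = 28
PHI_RESTRAINTS = 54
NOE_BY_RANGE = {
    "intra-residue": 1077,
    "sequential": 621,
    "medium-range": 550,
    "long-range": 595,
}


def unambiguous_noe_total() -> int:
    """Sum of the unambiguous NOE restraints over the four range classes."""
    return sum(NOE_BY_RANGE.values())


def all_restraints_total() -> int:
    """Inferred grand total: unambiguous + ambiguous + H-bond + phi-angle."""
    return (unambiguous_noe_total() + ARIA_AMBIGUOUS_NOE
            + HBOND_RESTRAINTS + PHI_RESTRAINTS)


def percentages_sum_to_100(values, tol: float = 0.05) -> bool:
    """True if rounded percentages add up to 100 within a small tolerance."""
    return math.isclose(sum(values), 100.0, abs_tol=tol)


if __name__ == "__main__":
    print(unambiguous_noe_total())                 # compare with 2843
    print(MANUAL_NOE + ARIA_UNAMBIGUOUS_NOE)       # compare with 2843
    print(all_restraints_total())                  # compare with 3515
    print(percentages_sum_to_100([71.0, 23.8, 3.5, 1.7]))
    print(percentages_sum_to_100([88.9, 11.0, 0.1, 0.0]))
```

The same check covers the two sets of Ramachandran percentages, each of which should total 100% within rounding.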



Site-directed mutagenesis: Mutant proteins were prepared using the QuikChange
site-directed mutagenesis kit (Stratagene). The presence of appropriate mutations was
confirmed by DNA sequencing.

Ligand titration: Ligand titration experiments were performed by recording a series of
2D ¹⁵N- and ¹³C-HSQC spectra on the uniformly ¹⁵N- and ¹⁵N/¹³C-labelled
bromodomain (~0.3 mM), respectively, in the presence of ligand at concentrations
ranging from 0 to approximately 2.0 mM. The protein sample and the
stock solutions of the ligands were all prepared in the same aqueous buffer containing
100 mM phosphate and 5 mM perdeuterated DTT at pH 6.5.
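
The text does not state how the titration data were analyzed. Purely as an illustration, the following sketch shows one common way such a series can be interpreted, namely a 1:1 binding model in fast exchange, in which the observed chemical shift change is proportional to the fraction of protein bound. The model, the dissociation constant, and the maximal shift change are assumptions introduced here and are not values taken from this document. Only the protein concentration (~0.3 mM) and the ligand range (0 to approximately 2.0 mM) come from the text.

```python
import math


def fraction_bound(protein_mM: float, ligand_mM: float, kd_mM: float) -> float:
    """Fraction of protein bound for 1:1 binding, from total concentrations.

    Solves the quadratic for the complex concentration [PL]:
        [PL]^2 - (P + L + Kd)[PL] + P*L = 0
    and returns [PL] / P (the physically meaningful root).
    """
    if protein_mM <= 0 or kd_mM <= 0 or ligand_mM < 0:
        raise ValueError("concentrations must be positive (ligand may be 0)")
    s = protein_mM + ligand_mM + kd_mM
    complex_mM = (s - math.sqrt(s * s - 4.0 * protein_mM * ligand_mM)) / 2.0
    return complex_mM / protein_mM


def observed_shift(delta_max_ppm: float, protein_mM: float,
                   ligand_mM: float, kd_mM: float) -> float:
    """Predicted shift change under fast exchange (assumed model)."""
    return delta_max_ppm * fraction_bound(protein_mM, ligand_mM, kd_mM)


if __name__ == "__main__":
    HYPOTHETICAL_KD_MM = 1.0         # assumption, not from the document
    HYPOTHETICAL_DELTA_MAX = 0.20    # ppm, assumption
    for ligand in (0.0, 0.25, 0.5, 1.0, 2.0):
        shift = observed_shift(HYPOTHETICAL_DELTA_MAX, 0.3,
                               ligand, HYPOTHETICAL_KD_MM)
        print(f"{ligand:4.2f} mM ligand -> {shift:.3f} ppm")
```

In practice the two hypothetical parameters would be adjusted to fit the measured shift changes at each ligand concentration; a different binding model may be more appropriate for a given system.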



The full length nucleic acid sequence of the human p300/CBP-associated factor
(P/CAF) was obtained from GenBank, Accession No: U57317.2 (SEQ ID NO:1):

1 ggggccgcgt cgacgcggaa aagaggccgt ggggggcctc ccagcgctgg cagacaccgt
61 gaggctggca gccgccggca cgcacaccta gfcccgcagtc ccgaggaaca tgtccgcagc 
121 cagggcgcgg agcagagtcc cgggcaggag aaccaaggga gggcgtgtgc tgtggcggcg 
181 gcggcagcgg cagcggagcc gctagtcccc tccctcctgg gggagcagct gccgccgctg 
241 ccgccgccgc caccaccatc agcgcgcggg gcccggccag agcgagccgg gcgagcggcg
301 cgctaggggg agggcggggg cggggagggg ggtgggcgaa gggggcggga gggcgtgggg
361 ggagggtctc gctctcccga ctaccagagc ccgagggaga ccctggcggc ggcggcggcg 
421 cctgacactc ggcgcctcct gccgtgctcc ggggcggcat gtccgaggct ggcggggccg 
481 ggccgggcgg ctgcggggca ggagccgggg caggggccgg gcccggggcg ctgcccccgc 
541 agcctgcggc gcttccgccc gcgcccccgc agggctcccc ctgcgccgct gccgccgggg 
601 gctcgggcgc ctgcggtccg gcgacggcag tggctgcagc gggcacggcc gaaggaccgg 
661 gaggcggtgg ctcggcccga atcgccgtga agaaagcgca actacgctcc gctccgcggg 
721 ccaagaaact ggagaaactc ggagtgtact ccgcctgcaa ggccgaggag tcttgtaaat 
781 gtaatggctg gaaaaaccct aacccctcac ccactccccc cagagccgac ctgcagcaaa 
841 taattgtcag tctaacagaa tcctgtcgga gttgtagcca tgccctagct gctcatgttt 
901 cccacctgga gaatgtgtca gaggaagaaa tgaacagact cctgggaata gtattggatg 
961 tggaatatct ctttacctgt gtccacaagg aagaagatgc agataccaaa caagtttatt 
1021 tctatctatt taagctcttg agaaagtcta ttttacaaag aggaaaacct gtggttgaag 
1081 gctctttgga aaagaaaccc ccatttgaaa aacctagcat tgaacagggt gtgaataact 
1141 ttgtgcagta caaatttagt cacctgccag caaaagaaag gcaaacaata gttgagttgg 
1201 caaaaatgtt cctaaaccgc atcaactatt ggcatctgga ggcaccatct caacgaagac
1261 tgcgatctcc caatgatgat atttctggat acaaagagaa ctacacaagg tggctgtgtt 
1321 actgcaacgt gccacagttc tgcgacagtc tacctcggta cgaaaccaca caggtgtttg 
1381 ggagaacatt gcttcgctcg gtcttcactg ttatgaggcg acaactcctg gaacaagcaa 
1441 gacaggaaaa agataaactg cctcttgaaa aacgaactct aatcctcact catttcccaa 
1501 aatttctgtc catgctagaa gaagaagtat atagtcaaaa ctctcccatc tgggatcagg 
1561 attttctctc agcctcttcc agaaccagcc agctaggcat ccaaacagtt atcaatccac 
1621 ctcctgtggc tgggacaatt tcatacaatt caacctcatc ttcccttgag cagccaaacg 
1681 cagggagcag cagtcctgcc tgcaaagcct cttctggact tgaggcaaac ccaggagaaa 
1741 agaggaaaat gactgattct catgttctgg aggaggccaa gaaaccccga gttatggggg 
1801 atattccgat ggaattaatc aacgaggtta tgtctaccat cacggaccct gcagcaatgc 
1861 ttggaccaga gaccaatttt ctgtcagcac actcggccag ggatgaggcg gcaaggttgg 
1921 aagagcgcag gggtgtaatt gaatttcacg tggttggcaa ttccctcaac cagaaaccaa 
1981 acaagaagat cctgatgtgg ctggttggcc tacagaacgt tttctcccac cagctgcccc 
2041 gaatgccaaa agaatacatc acacggctcg tctttgaccc gaaacacaaa acccttgctt 
2101 taattaaaga tggccgtgtt attggtggta tctgtttccg tatgttccca tctcaaggat 
2161 tcacagagat tgtcttctgt gctgtaacct caaatgagca agtcaagggc tatggaacac 
2221 acctgatgaa tcatttgaaa gaatatcaca taaagcatga catcctgaac ttcctcacat 
2281 atgcagatga atatgcaatt ggatacttta agaaacaggg tttctccaaa gaaattaaaa 
2341 tacctaaaac caaatatgtt ggctatatca aggattatga aggagccact ttaatgggat 
2401 gtgagctaaa tccacggatc ccgtacacag aattttctgt catcattaaa aagcagaagg
2461 agataattaa aaaactgatt gaaagaaaac aggcacaaat tcgaaaagtt taccctggac 
2521 tttcatgttt taaagatgga gttcgacaga ttcctataga aagcattcct ggaattagag 
2581 agacaggctg gaaaccgagt ggaaaagaga aaagtaaaga gcccagagac cctgaccagc 
2641 tttacagcac gctcaagagc atcctccagc aggtgaagag ccatcaaagc gcttggccct 
2701 tcatggaacc tgtgaagaga acagaagctc caggatatta tgaagttata aggttcccca 
2761 tggatctgaa aaccatgagt gaacgcctca agaataggta ctacgtgtct aagaaattat 
2821 tcatggcaga cttacagcga gtctttacca attgcaaaga gtacaacgcc gctgagagtg 
2881 aatactacaa atgtgccaat atcctggaga aattcttctt cagtaaaatt aaggaagctg 
2941 gattaattga caagtgattt tttttccccc tctgcttctt agaaactcac caagcagtgt 
3001 gcctaaagca aggt 

The full-length protein sequence of the human p300/CBP-associated factor (P/CAF) 
was obtained from GenBank, Accession No. U57317.2 (SEQ ID NO:2): 

1 MSEAGGAGPG GCGAGAGAGA GPGALPPQPA ALPPAPPQGS PCAAAAGGSG ACGPATAVAA 
61 AGTAEGPGGG GSARIAVKKA QLRSAPRAKK LEKLGVYSAC KAEKSCKCNG WKNPOTSPTP 
121 PRADLQQIIV SLTESCRSCS HALAAHVSHL ENVSEEEMNR LLGIVLDVEY LFTCVHKEED 
181 ADTKQVYFYL FKLLRKSILQ RGKPVVEGSL EKKPPFEKPS IEQGVVNFVQ YKFSHLPAKE 
241 RQTIVELAKM FLNRINYWHL EAPSQRRLRS PNDDISGYKE NYTRWLCYCN VPQFCDSLPR 
301 YETTQVFGRT LLRSVFTVMR RQLLEQARQE KDKLPLEKRT LILTHFPKFL SMLEEEVYSQ 
361 NSPIWDQDFL SASSRTSQLG IQTVINPPPV AGTISYNSTS SSLEQPNAGS SSPACKASSG 
421 LEANPGEKRK MTDSHVLEEA KKPRVMGDIP MELINEVMST ITDPAAMLGP ETNFLSAHSA 
481 RDEAARLEER RGVIEFHVVG NSLNQKPNKK ILMWLVGLQN VFSHQLPRMP KEYITRLVFD 
541 PKHKTLALIK DGRVIGGICF RMFPSQGFTE IVFCAVTSNE QVKGYGTHLM NHLKEYHIKH 
601 DILNFLTYAD EYAIGYFKKQ GFSKEIKIPK TKYVGYIKDY EGATLMGCEL NPRIPYTEFS 
661 VIIKKQKEII KKLIERKQAQ IRKVYPGLSC FKDGVRQIPI ESIPGIRETG WKPSGKEKSK 
721 EPRDPDQLYS TLKSILQQVK SHQSAWPFME PVKRTEAPGY YEVIRFPMDL KTMSERLKNR 
781 YYVSKKLFMA DLQRVFTNCK EYNAAESEYY KCANILEKFF FSKIKEAGLI DK 
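
By way of non-limiting illustration, the correspondence between the nucleotide
sequence and the protein sequence listed above can be checked by translating a
fragment with the standard genetic code. The Python sketch below is
illustrative only. The function name `translate` and the choice of fragment
(the 51 nucleotides beginning at the atg within the line numbered 1801 above,
which should encode residues 451-467 of the protein sequence) are assumptions
made for this example and are not part of the original disclosure.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame; spaces are ignored, '*' marks a stop."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Fragment copied from the nucleotide listing (lines numbered 1801 and 1861).
fragment = "atggaattaatc aacgaggtta tgtctaccat cacggaccct gcagcaatg"
protein_fragment = translate(fragment)
print(protein_fragment)
```

The translated fragment can then be compared with the corresponding stretch of
the protein listing (the block beginning MELINEVMST) to detect transcription
errors in either sequence.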


Results 

The P/CAF bromodomain represents an extensive family of bromodomains (Figure 1). 
A large number of long-range nuclear Overhauser enhancement (NOE)-derived 
distance restraints were identified in the NMR data of the P/CAF bromodomain, 
yielding a well-defined three-dimensional structure (Figures 2A-2D). Table 1 shows 
the NMR chemical shift assignment of the P/CAF bromodomain. Table 2 shows the 
unambiguous NOE-derived distance restraints. Table 3 shows the ambiguous 
NOE-derived distance restraints. Table 4 shows the hydrogen bond restraints. The 
NMR structure coordinates of the P/CAF bromodomain in the free form and in 
complex with acetyl-histamine are shown in Tables 5 and 6, respectively. 

The structure consists of a four-helix bundle (helices αZ, αA, αB, and αC) with a 
left-handed twist, and a long intervening loop between helices αZ and αA (termed the 
ZA loop, Figure 2E). The four amphipathic α-helices are packed tightly against one 
another in an antiparallel manner, with crossing angles for adjacent helices of ~16-20°. 
The up-and-down four-helix bundle can adopt two topological folds with opposite 
handedness (Figures 2F-2G). The right-handed four-helix bundle fold occurs more 
commonly and is seen in proteins such as hemerythrin and cytochrome b562. The 
left-handed fold of the bromodomain structure is less common, but is also observed in 
proteins such as cytochrome b5 and T4 lysozyme [Richardson, J., Adv. Protein Chem., 
34:167-339 (1989); Presnell and Cohen, Proc. Natl. Acad. Sci. USA 86:6592-6596 
(1989)]. This topological difference arises from the orientation of the loop between the 
first two helices (Figures 2F-2G). The right-handed four-helix bundle proteins have a 
relatively short hairpin-like connection between the first two helices, which makes the 
"preferred" turn to the right at the top of the first helix [Richardson, J., Adv. Protein 
Chem., 34:167-339 (1989); Presnell and Cohen, Proc. Natl. Acad. Sci. USA 
86:6592-6596 (1989); Weber and Salemme, Nature 287:82-84 (1980)]. In contrast, 
proteins with the left-handed fold usually have a long loop after the first helix and often 
contain additional secondary structural elements at the base of the helix bundle 
[Richardson, J., Adv. Protein Chem., 34:167-339 (1989); Presnell and Cohen, Proc. 
Natl. Acad. Sci. USA 86:6592-6596 (1989)]. In the bromodomain structure, this long 
ZA loop has a defined conformation and is packed against the loop between helices 
αB and αC (termed the BC loop) to form a hydrophobic pocket. These tertiary 
interactions between the two loops appear to favor the left turn of the ZA loop, 
resulting in the left-handed four-helix bundle fold of the bromodomain. The 
hydrophobic pocket formed by loops ZA and BC is lined by residues Val752, Ala757, 
Tyr760, Val763, Tyr802 and Tyr809 (Figure 2H), and appears to be a site for 
protein-protein interactions (see below). The pocket is located at one end of the 
four-helix bundle, opposite to the N- and C-termini of the protein. Interestingly, the 
ZA loop varies in length amongst different bromodomains, but almost always contains 
residues corresponding to Phe748, Pro751, Pro758, Tyr760, and Pro767 (Figure 1). 
The conservation of these residues within the ZA loop, as well as of residues within the 
α-helical regions, implies a similar left-handed four-helix bundle structure for the large 
family of bromodomains (Figure 1). 

The modular bromodomain structure supports the idea that a bromodomain can act as a 
functional unit for protein-protein interactions. The observation that bromodomains 
are found in nearly all known nuclear HATs (A-type), which promote 
transcription-related acetylation of histones on specific lysine residues, but are not 
present in cytoplasmic HATs (B-type), prompted the determination of whether 
bromodomains can interact with acetyl-lysine (AcK). An NMR titration of the P/CAF 
bromodomain was performed with a peptide (SGRGKGG-acK-GLGK) derived from 
histone H4, in which Lys8 is acetylated (Lys8 is the major acetylation site in H4 for 
GCN5, a yeast homologue of P/CAF). Remarkably, the bromodomain could indeed 
bind the AcK peptide. Moreover, this interaction appeared to be specific, based on the 
¹⁵N-HSQC spectra, which showed that only a limited number of residues underwent 
chemical shift changes as a function of peptide concentration (Figure 3A). Conversely, 
the NMR titration of the bromodomain with a non-acetylated, but otherwise identical, 
H4 peptide showed no noticeable chemical shift changes, demonstrating that the 
interaction between the bromodomain and the lysine-acetylated H4 peptide was 
dependent upon acetylation of lysine. The dissociation constant (K_D) for the AcK 
peptide was estimated to be 346 ± 54 µM. This binding is likely reinforced through 
additional interactions between bromodomain-containing proteins and target proteins. 
Notably, many chromatin-associated proteins contain two or more bromodomains 
(Figure 1). Indeed, binding with another lysine-acetylated peptide 
(RKSTGG-acK-APRKQ) derived from the major acetylation site on histone H3 
(residues 9-20) was also observed. Together, these data demonstrate that the P/CAF 
bromodomain has the ability to bind AcK peptides in an acetylation-dependent manner. 
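
As a non-limiting illustration of how a dissociation constant may be estimated
from such an NMR titration, the following Python sketch fits observed chemical
shift changes to a 1:1 binding model in fast exchange. It is a minimal sketch
under stated assumptions and not the procedure actually used to obtain the
value reported above: the total protein concentration, the ligand
concentrations, the maximal shift change, and the function names
`bound_fraction` and `fit_kd` are all hypothetical, and the data are synthetic
(generated from a K_D of 346 µM purely so that the fit can be checked).

```python
import math


def bound_fraction(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (all concentrations in µM)."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)


def fit_kd(p_total, ligand_concs, shifts, kd_grid):
    """Grid search over K_D; the maximal shift is solved by linear least squares."""
    best = None
    for kd in kd_grid:
        f = [bound_fraction(p_total, l, kd) for l in ligand_concs]
        denom = sum(x * x for x in f)
        if denom == 0.0:
            continue
        dmax = sum(x * y for x, y in zip(f, shifts)) / denom
        sse = sum((y - dmax * x) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]


# Hypothetical titration: 500 µM protein, ligand from 0 to 4000 µM.
P_TOTAL = 500.0
ligand = [0.0, 100.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0]
# Synthetic, noise-free shift changes (ppm) generated from assumed parameters.
shifts = [0.20 * bound_fraction(P_TOTAL, l, 346.0) for l in ligand]

kd_fit, dmax_fit = fit_kd(P_TOTAL, ligand, shifts,
                          [float(k) for k in range(1, 2001)])
print(kd_fit, dmax_fit)
```

With real data, the scatter among K_D values fitted independently for several
perturbed residues would give one possible estimate of the uncertainty.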

Intriguingly, the bromodomain residues that exhibited the most significant ¹H and ¹⁵N 
chemical shift changes on peptide binding are located near the hydrophobic pocket 
between the ZA and BC loops (Figure 3B). Because a similar pattern of amide 
chemical shift changes was observed with the two different AcK-containing peptides, 
it was surmised that the hydrophobic cavity is the primary binding site for AcK. This 
hypothesis was further supported by titration with acetyl-histamine, which mimics the 
chemical structure of the AcK side-chain (Figure 3C). Both ¹⁵N- and ¹³C-HSQC 
spectra showed that interaction with acetyl-histamine was also acetylation-dependent, 
involving the same set of residues that showed chemical shift perturbations with 
similar concentration dependence. It should be noted that the bromodomain did not 
bind to the amino acids acetyl-lysine or acetyl-histidine alone, possibly due to the 
presence of the charged amino or carboxylate group adjacent to the acetyl 
moiety (Figure 3C). Taken together, these results strongly suggest that the P/CAF 
bromodomain can interact with acetyl-lysine-containing proteins in a specific manner, 
and that this interaction is localized to the bromodomain hydrophobic cavity. 
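
As a non-limiting illustration of how residues with significant amide chemical
shift changes may be identified, the Python sketch below computes a combined
¹H/¹⁵N chemical shift perturbation per residue and reports the residues above
a cutoff. It is a sketch under stated assumptions only. The free-state shifts
for residues 733, 752 and 757 are taken from Table 1; the ligand-bound shifts,
the ¹⁵N weighting factor of 0.2, the 0.05 ppm cutoff, and the function names
are hypothetical values chosen for this example.

```python
import math


def combined_csp(d_h, d_n, n_weight=0.2):
    """Combined amide shift change (ppm); n_weight scales 15N to the 1H range."""
    return math.sqrt(d_h ** 2 + (n_weight * d_n) ** 2)


def perturbed_residues(free, bound, threshold):
    """Return residue numbers whose combined shift change exceeds the threshold."""
    hits = []
    for res in sorted(free):
        if res not in bound:
            continue  # residue not assigned in the bound state
        h_free, n_free = free[res]
        h_bound, n_bound = bound[res]
        if combined_csp(h_bound - h_free, n_bound - n_free) > threshold:
            hits.append(res)
    return hits


# (HN, N) shifts in ppm. Free-state values are from Table 1.
free_shifts = {733: (8.563, 118.568), 752: (8.124, 124.450), 757: (7.379, 122.504)}
# Hypothetical bound-state values, invented for illustration only.
bound_shifts = {733: (8.565, 118.578), 752: (8.204, 124.950), 757: (7.429, 122.704)}

hits = perturbed_residues(free_shifts, bound_shifts, threshold=0.05)
print(hits)
```

Mapping the reported residues onto the three-dimensional structure is one way
to visualize a candidate binding site.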

To identify the key residues involved in bromodomain-AcK recognition, the NMR 
structure of the P/CAF bromodomain in complex with acetyl-histamine was elucidated. 
As anticipated, the acetylated moiety binds in the bromodomain hydrophobic pocket 
(Figure 4). The intermolecular interactions are largely hydrophobic in nature, with the 
methyl group of acetyl-histamine making extensive contacts with the side-chains of 
Val752, Ala757, and Tyr760, and the methylene groups of acetyl-histamine displaying 
specific NOEs to Val752, Ala757, Tyr760, Tyr802, and Tyr809. No intermolecular 
NOEs were observed for the imidazole ring of acetyl-histamine. From the spectral 
analysis it is clear that the structure of the bromodomain is very similar in both the free 
and complex forms. 

It is worth noting that the bromodomain-AcK recognition is reminiscent of the 
interactions between the histone acetyltransferase Hat1 and acetyl-CoA. Although the 
binding pockets of these two otherwise structurally unrelated proteins are composed of 
different secondary structural elements, the nature of acetyl-lysine recognition has 
striking similarities. In particular, Tyr809, Tyr802, Tyr760, and Val752 in the 
bromodomain appear to be related to Phe220, Phe261, Val254, and Ile217 of Hat1, 
respectively, in their interactions with the acetyl moiety. This observation may suggest 
an evolutionarily convergent mechanism of acetyl-lysine recognition between 
bromodomains and histone acetyltransferases. 

To determine the relative contributions of residues within the hydrophobic cavity to 
bromodomain-AcK binding, site-directed mutagenesis was used to alter residues 
Tyr809, Tyr802, Tyr760, and Val752 (Table 7). 
Table 7. Structural and Functional Analysis of the P/CAF Bromodomain Mutants 

Bromodomain      Structural       H4 AcK-Peptide Binding 
Proteins         Integrity (a)    K_D, µM (b) 

Wild-Type        ++++             346 ± 54 
Tyr809Ala        ++++             No Binding (c) 
Tyr802Ala        +++              > 10,000 (d) 
Tyr760Ala        +++              > 10,000 
Val752Ala                         > 10,000 



a. The effects of mutations on the structural integrity of the bromodomain were 
assessed by using the ¹⁵N-HSQC spectra. The amide ¹H/¹⁵N resonances of the mutant 
proteins were compared to those of the wild-type bromodomain to determine if the 
particular mutations lead to global or local structure disruption. Severe 
line-broadening of the amide resonances would indicate protein conformational 
exchange due to a decrease of structure stability resulting from point mutations. 
Structural integrity of the mutant proteins is expressed here relative to that of the 
wild-type, using the signs of "++++" for as stable as the wild-type, "+++" for mildly 
destabilized, "++" for moderately destabilized, and "-" for completely unfolded. 

b. The ligand binding affinity (K_D) of the bromodomain proteins was estimated by 
following chemical shift changes of amide peaks in the ¹⁵N-HSQC spectra as a 
function of the ligand concentration. 

c. No detectable ligand binding observed in the NMR titration. 

d. Ligand binding affinity was significantly reduced and beyond the limit for reliable 
measurements by NMR titration. 
Substitution of Ala for Tyr809 completely abrogated the bromodomain binding to the 
lysine-acetylated H4 peptide, while the Tyr802Ala, Tyr760Ala, and Val752Ala 
mutants had significantly reduced ligand binding affinity. To assess whether these 
mutations disrupted the overall bromodomain fold, the ¹⁵N-HSQC spectra of the 
mutants were compared to that of the wild-type protein. For the Tyr809Ala mutant, the 
amide chemical shifts were only affected for a few residues near the mutation site. 
However, mutations of the other residues in the hydrophobic binding pocket perturbed 
the local protein conformation to greater extents, particularly the ZA loop (Table 7). 
Thus, the NMR structural analysis and the mutagenesis studies show that Tyr809, 
which is structurally supported by Trp746 and Asn803 (Figure 4), is essential for the 
bromodomain interaction with the acetyl group of acetyl-lysine, while residues 
Tyr802, Tyr760, and Val752 likely play both structural and functional roles in the 
recognition. These residues are highly conserved throughout the bromodomain family 
(Figure 1), suggesting that recognition of acetyl-lysine may be a feature of 
bromodomains in general. Therefore, Val752, Ala757, Tyr760, Tyr802, Asn803, and 
Tyr809 are key amino acid residues for the P/CAF bromodomain binding to 
acetyl-lysine. 
Table 8: Amino Acid Sequences of Bromodomains Identified in Figure 1 

PROTEIN BD   SEQ ID NO:   GenBank Acc. No.   PROTEIN BD   SEQ ID NO:   GenBank Acc. No. 

hsP/CAF      7            U57317             dmFSH-2      25 
hsGCN5       8            U57136             scBDF1-2     26 
ttP55        9            U47321             hsBR140      27           JC2069 
scGCN5       10           Q03330             hsSMAP       28           X87613 
hsP300       11           A54277             ggPB1-1      29           X90849 
hsCBP        12           S39162             ggPB1-2      30 
mmCBP        13           S39161             ggPB1-3      31 
ceYNJ1       14           P34545             ggPB1-4      32 
hsCCG1-1     15           P21675             ggPB1-5      33 
msCCG1-1     16           D26114             spBRO-1      34           S54260 
hsCCG1-2     17                              spBRO-2      35 
msCCG1-2     18                              hsSNF2a      36           S45251 
hsRing3-1    19           P25440             hsBRG1       37           S39039 
hsORFX-1     20           D26362             ggBRM        38           X91638 
dmFSH-1      21           P13709             ggBRG1       39           X91637 
scBDF1-1     22           P35817             hsTIF1b      40           X97548 
hsRing3-2    23                              mmTIF1b      41           X99644 
hsORFX-2     24                              mmTIF1a      42           S78219 



The present invention is not to be limited in scope by the specific embodiments 
described herein. Indeed, various modifications of the invention in addition to those 
described herein will become apparent to those skilled in the art from the foregoing 
description and the accompanying figures. Such modifications are intended to fall 
within the scope of the appended claims. 

It is further to be understood that all base sizes or amino acid sizes, and all molecular 
weight or molecular mass values, given for nucleic acids or polypeptides are 
approximate, and are provided for description. 

Various publications are cited herein, the disclosures of which are hereby incorporated 
by reference herein in their entireties. 



WHAT IS CLAIMED IS: 

1. An isolated nucleic acid encoding a peptide consisting of about 21 to 40 
amino acids comprising a ZA loop of a bromodomain comprising the amino acid 
sequence of SEQ ID NO:3. 

2. The isolated nucleic acid of Claim 1 further comprising a heterologous 
nucleotide sequence. 

3. An isolated nucleic acid encoding a peptide consisting of about 21 to 40 
amino acids comprising a ZA loop of a bromodomain, wherein the bromodomain has 
an amino acid sequence selected from the group consisting of SEQ ID NOs. 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 
33, 34, 35, 36, 37, 38, 39, 40, 41, and 42. 

4. The isolated nucleic acid of Claim 3 further comprising a heterologous 
nucleotide sequence. 

5. A peptide consisting of about 21 to 40 amino acids comprising a ZA loop of 
a bromodomain comprising the amino acid sequence of SEQ ID NO:3. 

6. A fusion protein or peptide comprising the peptide of Claim 5. 

7. A peptide consisting of about 21 to 40 amino acids comprising a ZA loop of 
a bromodomain, wherein the bromodomain has an amino acid sequence selected from 
the group consisting of SEQ ID NOs. 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, and 
42. 

8. A fusion protein or peptide comprising the peptide of Claim 7. 

9. An antibody raised against the peptide of Claim 7 or raised against an 
antigenic fragment thereof. 

10. An antibody raised against the peptide of Claim 5. 

11. A method of identifying a compound that modulates the affinity of a 
bromodomain for a ligand that comprises an acetyl-lysine, 
said method comprising: 
(a) contacting the bromodomain and the ligand in the presence of the 
compound, wherein the bromodomain and the ligand bind in the absence of the 
compound; and 
(b) measuring the affinity of the bromodomain for the ligand; wherein 
a compound is identified as a compound that modulates the affinity of the 
bromodomain for the ligand when there is a change in the affinity of the 
bromodomain for the ligand in the presence of the compound. 

12. The method of Claim 11, wherein the affinity of the bromodomain for the 
ligand increases in the presence of the compound and wherein the compound is 
identified as a bromodomain-ligand complex promoting agent. 

13. The method of Claim 11, wherein the affinity of the bromodomain for the 
ligand decreases in the presence of the compound and the compound is identified as an 
inhibitor. 

14. The method of Claim 11, wherein the compound is selected by performing 
rational drug design with the set of atomic coordinates obtained from one or more of 
Tables 1-6, wherein said selecting is performed in conjunction with computer 
modeling. 

15. The method of Claim 11, wherein the compound is selected by performing 
rational drug design with the set of atomic coordinates obtained from a set of atomic 
coordinates defining the three-dimensional structure of a bromodomain consisting of 
the amino acid sequence of SEQ ID NO:7, wherein said selecting is performed in 
conjunction with computer modeling. 

16. A method of identifying a compound that modulates the stability of a 
bromodomain-acetyl-lysine binding complex comprising: 
(a) contacting the bromodomain-acetyl-lysine binding complex in the 
presence of the compound wherein the bromodomain-acetyl-lysine binding complex 
forms in the absence of the compound; and 
(b) measuring the stability of the bromodomain-acetyl-lysine binding 
complex; wherein a compound is identified as a compound that modulates the stability 
of the bromodomain-acetyl-lysine binding complex, when there is a change in the 
stability of the bromodomain-acetyl-lysine binding complex in the presence of the 
compound. 

17. The method of Claim 16, wherein the stability of the 
bromodomain-acetyl-lysine binding complex increases in the presence of the 
compound and wherein the compound is identified as a stabilizing agent. 

18. The method of Claim 16, wherein the stability of the 
bromodomain-acetyl-lysine binding complex decreases in the presence of the 
compound and the compound is identified as an inhibitor. 

19. The method of Claim 16, wherein the compound is selected by performing 
rational drug design with the set of atomic coordinates obtained from one or more of 
Tables 1-6, wherein said selecting is performed in conjunction with computer 
modeling. 

20. The method of Claim 16, wherein the compound is selected by performing 
rational drug design with the set of atomic coordinates obtained from a set of atomic 
coordinates defining the three-dimensional structure of a bromodomain consisting of 
the amino acid sequence of SEQ ID NO:7, wherein said selecting is performed in 
conjunction with computer modeling. 

21. A method of identifying a binding partner for a protein that comprises an 
acetyl-lysine, said method comprising: 
(a) contacting the protein with a polypeptide comprising a 
bromodomain; and 
(b) determining whether the polypeptide binds to the protein; wherein 
a binding partner for a protein is identified when the polypeptide binds to the protein. 

22. The method of Claim 21 wherein the bromodomain has an amino acid 
sequence selected from the group consisting of SEQ ID NOs. 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 
36, 37, 38, 39, 40, 41 and 42. 

23. An agent that can inhibit the binding of a bromodomain with a protein 
comprising an acetyl-lysine selected from the group consisting of: 
ISYGR-AcK-KRRQRR (SEQ ID NO:4), ARKSTGG-AcK-APRKQL (SEQ ID NO:5) and 
QSTSRHK-AcK-LMFKTE (SEQ ID NO:6). 
ABSTRACT OF THE INVENTION 



The present invention provides the structural determination of a bromodomain 
determined by NMR spectroscopy. The present invention also provides a binding 
partner for the bromodomain. In addition, the present invention provides 
methodology for related drug discovery using high throughput drug screening or 
structure based rational drug design using the three-dimensional data. 
Table 1 

NMR Chemical Shift Assignment of the P/CAF Bromodomain 


RES_ID 715 

RES_TYPE GLY 

SPIN_SYSTEM_ID 1 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 716 

RES_TYPE SER 

SPIN_SYSTEM_ID 2 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 717 

RES_TYPE HIS 

SPIN_SYSTEM_ID 3 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 718 

RES_TYPE MET 

SPIN_SYSTEM_ID 4 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 719 

RES_TYPE SER 

SPIN_SYSTEM_ID 5 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 720 
RES_TYPE LYS 
SPIN_SYSTEM_ID 6 
HETEROGENEITY 100 
CA 56.296000 
HA 4.361000 
CB 33.140000 
HB1 1.832000 
HB2 1.684000 
CG 25.430000 
HG1 1.585000 
HG2 1.433000 
CD 29.834000 
HD1 1.703000 
CE 41.960000 
HE1 3.003000 
END_RES_DEF 

RES_ID 721 
RES_TYPE GLU 
SPIN_SYSTEM_ID 7 
HETEROGENEITY 100 

N 122.990000 

HN 8.317000 

CA 54.620000 

HA 4.540000 

CB 29.830000 

HB1 2.024000 

HB2 1.893000 

CG 35.893000 

HG1 2.271000 
END_RES_DEF 

RES_ID 722 
RES_TYPE PRO 
SPIN_SYSTEM_ID 8 
HETEROGENEITY 100 

CA 63.430000 

HA 4.393000 

CB 32.030000 

HB1 2.224000 

HB2 1.880000 

CG 27.630000 

HG1 2.028000 

CD 50.760000 

HD2 3.656000 

HD1 3.800000 
END_RES_DEF 

RES_ID 723 
RES_TYPE ARG 
SPIN_SYSTEM_ID 9 



HETEROGENEITY 100 
N 121.192000 
HN 8.416000 
CA 63.430000 
HA 4.331000 
CB 30.930000 
HB1 1.815000 
HB2 1.762000 
CG 27.630000 
HG1 1.681000 
CD 43.603000 
HD1 3.161000 
END_RES_DEF 

RES_ID 724 
RES_TYPE ASP 
SPIN_SYSTEM_ID 10 
HETEROGENEITY 100 
N 122.012000 
HN 8.273000 
CA 52.415000 
HA 4.874000 
CB 41.400000 
HB1 2.754000 
HB2 2.692000 
END_RES_DEF 

RES_ID 725 
RES_TYPE PRO 
SPIN_SYSTEM_ID 11 
HETEROGENEITY 100 
CA 65.080000 
HA 4.329000 
CB 32.590000 
HB1 2.326000 
HB2 1.973000 
CG 27.632000 
HG1 2.028000 
CD 51.310000 
HD1 3.866000 
END_RES_DEF 

RES_ID 726 
RES_TYPE ASP 
SPIN_SYSTEM_ID 12 
HETEROGENEITY 100 

N 119.716000 

HN 8.397000 

CA 55.720000 

HA 4.692000 

CB 40.550000 

HB1 2.792000 

HB2 2.730000 
END_RES_DEF 

RES_ID 727 
RES_TYPE GLN 
SPIN_SYSTEM_ID 13 
HETEROGENEITY 100 

N 121.356000 

HN 8.196000 

CA 55.920000 

HA 4.163000 

CB 28.730000 

HB1 2.148000 

CG 34.240000 

HG1 2.524000 

HG2 2.371000 
END_RES_DEF 

RES_ID 728 
RES_TYPE LEU 
SPIN_SYSTEM_ID 14 
HETEROGENEITY 100 

N 121.356000 

HN 8.210000 

CA 58.473000 

HA 4.045000 

CB 41.400000 

HB1 1.847000 

HB2 1.555000 

CG 27.080000 

HG 1.480000 

CD1 25.970000 

HD1# 0.794000 

CD2 23.226000 

HD2# 0.786000 
END_RES_DEF 

RES_ID 729 

RES_TYPE TYR 

SPIN_SYSTEM_ID 15 

HETEROGENEITY 100 

N 119.060000 

HN 8.021000 



CA 62.320000 

HA 4.038000 

CB 38.640000 
HB1 3.211000 
HB2 3.024000 
CD1 134.350000 
HD1 7.053000 
CE1 119.481000 
HE1 6.882000 
END_RES_DEF 

RES_ID 730 

RES_TYPE SER 
SPIN_SYSTEM_ID 16 
HETEROGENEITY 100 
N 112.173000 
HN 8.167000 
HA 3.920000 
HB1 3.995000 
END_RES_DEF 

RES_ID 731 
RES_TYPE THR 
SPIN_SYSTEM_ID 17 
HETEROGENEITY 100 
N 120.372000 
HN 8.059000 
CA 66.730000 
HA 3.924000 
CB 68.930000 
HB 4.247000 
CG2 21.570000 
HG2# 1.142000 
END_RES_DEF 

RES_ID 732 
RES_TYPE LEU 
SPIN_SYSTEM_ID 18 
HETEROGENEITY 100 
N 120.536000 
HN 8.460000 
CA 57.920000 
HA 3.289000 
CB 39.750000 
HB1 1.532000 
HB2 0.294000 
CG 24.880000 
HG 1.683000 
CD1 25.429000 
HD1# 0.469000 
CD2 19.921000 
HD2# -0.193000 
END_RES_DEF 

RES_ID 733 
RES_TYPE LYS 
SPIN_SYSTEM_ID 19 
HETEROGENEITY 100 

N 118.568000 

HN 8.563000 

CA 60.125000 

HA 3.679000 

CB 32.588000 

HB1 1.729000 

HB2 1.360000 

CG 24.880000 

HG1 1.280000 

CD 29.835000 

HD1 1.585000 

CE 41.960000 

HE1 2.918000 
END_RES_DEF 

RES_ID 734 
RES_TYPE SER 
SPIN_SYSTEM_ID 20 
HETEROGENEITY 100 

N 113.157000 

HN 7.540000 

CA 61.227000 

HA 4.281000 

CB 63.879000 

HB1 4.060000 
END_RES_DEF 

RES_ID 735 
RES_TYPE ILE 
SPIN_SYSTEM_ID 21 
HETEROGENEITY 100 

N 120.700000 

HN 7.951000 

CA 65.080000 

HA 3.786000 

CB 38.095000 

HB 1.879000 



CG1 28.733000 
HG11 1.748000 
HG12 1.052000 
CG2 17.168000 
HG2# 1.003000 
CD1 13.863000 
HD1# 0.619000 
END_RES_DEF 

RES_ID 736 
RES_TYPE LEU 
SPIN_SYSTEM_ID 22 
HETEROGENEITY 100 
N 119.880000 
HN 8.841000 
CA 58.473000 
HA 4.090000 
CB 41.950000 
HB1 2.090000 
HB2 1.703000 
CG 27.330000 
HG 1.759000 
CD1 26.530000 
HD1# 1.061000 
CD2 23.776000 
HD2# 0.977000 
END_RES_DEF 

RES_ID 737 
RES_TYPE GLN 
SPIN_SYSTEM_ID 23 
HETEROGENEITY 100 
N 117.256000 
HN 8.505000 
CA 59.020000 
HA 4.032000 
CB 28.182000 
HB1 2.327000 
HB2 2.263000 
CG 34.240000 
HG1 2.536000 
HG2 2.461000 
END_RES_DEF 

RES_ID 738 
RES_TYPE GLN 
SPIN_SYSTEM_ID 24 
HETEROGENEITY 100 

N 118.896000 

HN 8.033000 

CA 59.574000 

HA 4.196000 

CB 29.835000 

HB1 2.482000 

HB2 2.469000 

CG 35.342000 

HG1 2.840000 

HG2 2.467000 

NE2 110.369000 

HE21 7.022000 

HE22 6.916000 
END_RES_DEF 

RES_ID 739 
RES_TYPE VAL 
SPIN_SYSTEM_ID 25 
HETEROGENEITY 100 
N 119.716000 
HN 8.526000 
CA 67.830000 
HA 3.844000 
CB 32.030000 
HB 2.384000 
CG1 23.330000 
HG1# 1.183000 
CG2 22.120000 
HG2# 1.033000 
END_RES_DEF 

RES_ID 740 
RES_TYPE LYS 
SPIN_SYSTEM_ID 26 
HETEROGENEITY 100 

N 114.633000 

HN 8.572000 

CA 59.574000 

HA 3.886000 

CB 32.380000 

HB1 1.873000 

HG1 1.022000 

HD1 1.520000 
END_RES_DEF 

RES_ID 741 
RES_TYPE SER 
SPIN_SYSTEM_ID 27 
HETEROGENEITY 100 
N 110.369000 
HN 7.557000 
CA 59.024000 
HA 4.448000 
CB 63.980000 
HB1 4.004000 
END_RES_DEF 

RES_ID 742 
RES_TYPE HIS 
SPIN_SYSTEM_ID 28 
HETEROGENEITY 100 
N 125.619000 
HN 7.536000 
CA 58.473000 
HA 3.967000 
CB 32.588000 
HB1 2.990000 
HB2 2.799000 
CD2 118.930000 
HD2 4.978000 
CE1 138.755000 
HE1 7.522000 
END_RES_DEF 

RES_ID 743 
RES_TYPE GLN 
SPIN_SYSTEM_ID 29 
HETEROGENEITY 100 
N 128.571000 
HN 8.543000 
CA 59.125000 
HA 4.209000 
CB 29.834000 
HB1 2.111000 
CG 33.690000 
HG1 2.390000 
NE2 112.173000 
HE21 7.581000 
HE22 6.870000 
END_RES_DEF 

RES_ID 744 
RES_TYPE SER 
SPIN_SYSTEM_ID 30 
HETEROGENEITY 100 
N 119.060000 
HN 11.668000 
CA 60.125000 
HA 4.838000 
CB 63.980000 
HB1 4.334000 
HB2 3.926000 
END_RES_DEF 

RES_ID 745 
RES_TYPE ALA 
SPIN_SYSTEM_ID 31 
HETEROGENEITY 100 

N 117.584000 

HN 7.868000 

CA 53.510000 

HA 4.396000 

CB 20.470000 

HB# 1.688000 
END_RES_DEF 

RES_ID 746 
RES_TYPE TRP 
SPIN_SYSTEM_ID 32 
HETEROGENEITY 100 
N 116.600000 
HN 7.135000 
CA 60.691000 
HA 4.368000 
CB 27.630000 
HB1 3.594000 
HB2 3.351000 
CD1 128.843000 
HD1 7.897000 
NE1 110.861000 
HE1 10.474000 
CE3 122.234000 
HE3 7.336000 
CZ2 116.177000 
HZ2 7.382000 
CZ3 123.336000 
HZ3 7.197000 
CH2 126.089000 
HH2 7.150000 
END_RES_DEF 

RES_ID 747 
RES_TYPE PRO 
SPIN_SYSTEM_ID 33 
HETEROGENEITY 100 
CA 64.531000 
HA 3.756000 
CB 29.835000 
HB1 0.487000 
HB2 -0.783000 
CG 26.530000 
HG1 0.233000 
HG2 -0.931000 
CD 50.212000 
HD2 1.567000 
HD1 2.177000 
END_RES_DEF 

RES_ID 748 
RES_TYPE PHE 
SPIN_SYSTEM_ID 34 
HETEROGENEITY 100 
N 113.321000 
HN 7.585000 
CA 55.719000 
HA 4.930000 
CB 39.202000 
HB1 3.491000 
HB2 2.532000 
CD1 133.248000 
HD1 7.099000 
HE1 7.174000 
HZ 7.296000 
END_RES_DEF 

RES_ID 749 
RES_TYPE MET 
SPIN_SYSTEM_ID 35 
HETEROGENEITY 100 
N 117.748000 
HN 7.115000 
CA 56.820000 
HA 4.286000 
CB 32.590000 
HB1 2.233000 
HB2 2.174000 
CG 33.140000 
HG1 2.851000 
CE 17.168000 
HE# 2.175000 
END_RES_DEF 

RES_ID 750 
RES_TYPE GLU 
SPIN_SYSTEM_ID 36 
HETEROGENEITY 100 

N 113.813000 

HN 7.709000 

CA 53.516000 

HA 4.849000 

CB 31.487000 

HB1 2.091000 

HB2 1.730000 

CG 35.893000 

HG1 2.164000 
END_RES_DEF 

RES_ID 751 
RES_TYPE PRO 
SPIN_SYSTEM_ID 37 
HETEROGENEITY 100 

CA 62.879000 

HA 4.242000 

CB 32.040000 

HB1 2.328000 

HB2 1.683000 

CG 27.080000 

HG1 2.126000 

HG2 1.978000 

CD 50.763000 

HD1 3.670000 
END_RES_DEF 

RES_ID 752 
RES_TYPE VAL 
SPIN_SYSTEM_ID 38 
HETEROGENEITY 100 
N 124.450000 
HN 8.124000 
CA 63.430000 
HA 3.553000 
CB 32.580000 
HB 1.145000 
CG1 21.573000 
HG1# 0.464000 
CG2 21.573000 
HG2# 0.169000 



END_RES_DEF 

RES_ID 753 
RES_TYPE LYS 
SPIN_SYSTEM_ID 39 
HETEROGENEITY 100 
N 129.883000 
HN 9.045000 
CA 56.310000 
HA 4.370000 
CB 32.880000 
HB1 1.873000 
HG1 1.435000 
HD1 1.673000 
HE1 2.985000 
END_RES_DEF 

RES_ID 754 
RES_TYPE ARG 
SPIN_SYSTEM_ID 40 
HETEROGENEITY 100 

N 120.208000 

HN 8.054000 
END_RES_DEF 

RES_ID 755 
RES_TYPE THR 
SPIN_SYSTEM_ID 41 
HETEROGENEITY 100 
CA 63.430000 
HA 4.038000 
CB 68.380000 
HB 4.293000 
CG2 22.670000 
HG2# 1.267000 
END_RES_DEF 

RES_ID 756 
RES_TYPE GLU 
SPIN_SYSTEM_ID 42 
HETEROGENEITY 100 
N 118.732000 
HN 7.209000 
CA 56.270000 
HA 4.448000 
CB 30.930000 
HB1 2.174000 
HB2 2.000000 
CG 36.440000 
HG1 2.292000 
END_RES_DEF 

RES_ID 757 
RES_TYPE ALA 
SPIN_SYSTEM_ID 43 
HETEROGENEITY 100 
N 122.504000 
HN 7.379000 
CA 50.220000 
HA 4.937000 
CB 19.370000 
HB# 1.082000 
END_RES_DEF 

RES_ID 758 
RES_TYPE PRO 
SPIN_SYSTEM_ID 44 
HETEROGENEITY 100 

CA 65.080000 

HA 4.496000 

CB 31.487000 

HB1 2.374000 

HB2 2.027000 

CG 27.632000 

HG1 2.122000 

HG2 2.038000 

CD 50.212000 

HD2 3.515000 

HD1 3.717000 
END_RES_DEF 

RES_ID 759 

RES_TYPE GLY 

SPIN_SYSTEM_ID 45 

HETEROGENEITY 100 
END_RES_DEF 

RES_ID 760 
RES_TYPE TYR 
SPIN_SYSTEM_ID 46 
HETEROGENEITY 100 

N 122.504000 

HN 7.945000 

CA 62.328000 

HA 3.536000 



CB 39.750000 

HB1 2.689000 

HB2 2.487000 

CD1 133.799000 

HD1 5.120000 

CE1 118.379000 

HE1 6.070000 
END_RES_DEF 

RES_ID 761 
RES_TYPE TYR 
SPIN_SYSTEM_ID 47 
HETEROGENEITY 100 
N 113.157000 
HN 8.225000 
CA 60.676000 
HA 4.101000 
CB 37.550000 
HB1 3.189000 
HB2 2.801000 
CD1 134.901000 
HD1 7.342000 
CE1 118.930000 
HE1 6.646000 
END_RES_DEF 

RES_ID 762 
RES_TYPE GLU 
SPIN_SYSTEM_ID 48 
HETEROGENEITY 100 
N 117.912000 
HN 7.702000 
CA 57.922000 
HA 4.209000 
CB 29.480000 
HB1 2.086000 
CG 37.545000 
HG1 2.325000 
HG2 2.265000 
END_RES_DEF 

RES_ID 763 
RES_TYPE VAL 
SPIN_SYSTEM_ID 49 
HETEROGENEITY 100 
N 115.453000 
HN 7.135000 
CA 63.430000 
HA 4.077000 
CB 33.690000 
HB 2.015000 
CG1 21.020000 
HG1# 1.045000 
CG2 21.574000 
HG2# 0.991000 
END_RES_DEF 

RES_ID 764 
RES_TYPE ILE 
SPIN_SYSTEM_ID 50 
HETEROGENEITY 100 
N 122.832000 
HN 7.947000 
CA 57.920000 
HA 3.916000 
CB 34.240000 
HB 1.205000 
CG1 24.878000 
HG11 0.798000 
HG12 0.216000 
CG2 16.617000 
HG2# 0.380000 
CD1 9.457000 
HD1# 0.537000 
END_RES_DEF 

RES_ID 765 
RES_TYPE ARG 
SPIN_SYSTEM_ID 51 
HETEROGENEITY 100 

N 125.291000 

HN 7.749000 

CA 57.371000 

HA 3.875000 

CB 30.936000 

HB1 1.388000 

HB2 1.211000 

CG 27.080000 

HG1 1.319000 

HG2 1.173000 

CD 43.052000 

HD1 2.971000 
END_RES_DEF 

RES_ID 766 
RES_TYPE SER 
SPIN_SYSTEM_ID 52 
HETEROGENEITY 100 
N 116.600000 
HN 8.387000 
CA 54.618000 
HA 4.984000 
CB 38.640000 
HB1 3.034000 
HB2 2.907000 
END_RES_DEF 

RES_ID 767 
RES_TYPE PRO
SPIN_SYSTEM_ID 53 
HETEROGENEITY 100 
CA 63.429000 
HA 4.083000 
CB 32.588000 
HB1 2.209000 
CG 28.180000 
HG1 2.177000 
HG2 1.883000 
CD 50.763000
HD2 3.390000 
HD1 3.623000 
END_RES_DEF 

RES_ID 768 
RES_TYPE MET
SPIN_SYSTEM_ID 54
HETEROGENEITY 100 
N 119.060000 
HN 8.430000 
CA 54.067000 
HA 4.935000 
CB 31.487000 
HB1 1.989000 
HB2 1.353000 
CG 30.930000 
HG1 2.690000 
CE 14.414000 
HE# 1.929000 
END_RES_DEF 

RES_ID 769 
RES_TYPE ASP
SPIN_SYSTEM_ID 55
HETEROGENEITY 100

N 119.060000 

HN 7.365000 

CA 53.516000 

HA 4.745000 

CB 44.154000 

HB1 2.371000 
END_RES_DEF 

RES_ID 770 
RES_TYPE LEU
SPIN_SYSTEM_ID 56 
HETEROGENEITY 100 

N 116.272000 

HN 9.055000 

CA 57.922000 

HA 4.036000 

CB 41.400000 

HB1 2.095000 

HB2 1.395000 

CG 27.080000 

HG 1.713000 

CD1 27.080000 

HD1# 0.940000 

CD2 22.675000 

HD2# 0.628000 
END_RES_DEF 

RES_ID 771 
RES_TYPE LYS
SPIN_SYSTEM_ID 57 
HETEROGENEITY 100 

N 128.079000 

HN 8.738000 

CA 60.676000 

HA 4.198000 

CB 32.037000 

HB1 2.330000 

HB2 2.224000 

CG 25.280000 

HG1 1.483000 

HG2 1.403000 

CD 30.385000 

HD1 1.793000 

HD2 1.696000 

CE 41.950000 

HE1 2.965000 



END_RES_DEF 

RES_ID 772 
RES_TYPE THR
SPIN_SYSTEM_ID 58 
HETEROGENEITY 100 
N 122.176000 
HN 9.445000 
CA 67.040000 
HA 3.845000 
CB 67.835000 
HB 4.090000 
CG2 22.124000 
HG2# 1.058000 
END_RES_DEF 

RES_ID 773 
RES_TYPE MET
SPIN_SYSTEM_ID 59
HETEROGENEITY 100 
N 117.912000 
HN 7.882000 
CA 60.676000 
HA 4.319000 
CB 33.342000 
HB1 2.093000 
HB2 1.915000 
CG 33.139000 
HG1 2.621000
HG2 2.496000 
CE 16.620000 
HE# 1.241000 
END_RES_DEF 

RES_ID 774 
RES_TYPE SER
SPIN_SYSTEM_ID 60
HETEROGENEITY 100
N 116.108000 
HN 7.958000 
CA 62.879000 
HA 4.200000 
CB 62.879000 
HB1 4.368000 
HB2 4.040000 
END_RES_DEF 

RES_ID 775 
RES_TYPE GLU
SPIN_SYSTEM_ID 61
HETEROGENEITY 100

N 124.471000 

HN 3.150000 

CA 59.570000 

HA 4.045000 

CB 29.280000 

HB1 2.246000 

HB2 2.063000 

CG 36.443000 

HG1 2.345000 

HG2 2.176000 
END_RES_DEF 

RES_ID 776 
RES_TYPE ARG
SPIN_SYSTEM_ID 62 
HETEROGENEITY 100 

N 120.372000 

HN 8.391000 

CA 60.676000 

HA 3.869000 

CB 30.385000 

HB1 2.047000

HB2 1.076000 

CG 29.284000 

HG1 1.722000 

HG2 0.877000 

CD 44.154000

HD1 2.578000 

HD2 2.051000 
END_RES_DEF 

RES_ID 777 
RES_TYPE LEU
SPIN_SYSTEM_ID 63
HETEROGENEITY 100

N 120.208000 

HN 8.856000 

CA 58.470000 

HA 4.691000 

CB 42.621000 

HB1 2.295000 

HB2 1.925000 

CG 27.080000 

HG 1.832000 



CD1 25.429000
HD1# 1.067000 
CD2 27.081000 
HD2# 0.871000 
END_RES_DEF

RES_ID 778
RES_TYPE LYS
SPIN_SYSTEM_ID 64
HETEROGENEITY 100
N 120.372000 
HN 7.958000 
CA 59.574000 
HA 4.333000 
CB 32.588000 
HB1 2.055000 
CG 24.878000 
HG1 1.596000 
CD 29.835000 
HD1 1.804000 
CE 41.951000 
HE1 2.990000 
END_RES_DEF 

RES_ID 779 
RES_TYPE ASN
SPIN_SYSTEM_ID 65 
HETEROGENEITY 100 
N 116.108000 
HN 7.947000 
CA 53.510000 
HA 4.771000 
CB 38.095000 
HB1 3.019000 
HB2 2.773000 
ND2 112.665000 
HD21 7.598000 
HD22 6.969000 
END_RES_DEF 

RES_ID 780 
RES_TYPE ARG
SPIN_SYSTEM_ID 66 
HETEROGENEITY 100 
N 114.141000 
HN 8.158000 
CA 56.821000 
HA 4.405000 
CB 25.429000
HB1 2.097000 
HB2 2.022000 
CG 27.632000 
HG1 1.539000 
HG2 1.534000 
CD 43.050000 
HD1 3.060000
HD2 3.024000
END_RES_DEF

RES_ID 781
RES_TYPE TYR
SPIN_SYSTEM_ID 67 
HETEROGENEITY 100 

N 116.764000 

HN 8.222000 

CA 60.125000 

HA 4.064000 

CB 40.850000 

HB1 2.948000 

HB2 2.055000 

CD1 134.350000 

HD1 6.285000 

CE1 118.930000 

HE1 6.709000 
END_RES_DEF 

RES_ID 782 
RES_TYPE TYR
SPIN_SYSTEM_ID 68 
HETEROGENEITY 100 

N 114.633000 

HN 8.014000 

CA 57.920000 

HA 4.528000 

CB 36.443000 

HB1 3.062000 

HB2 2.907000 

CD1 133.248000 

HD1 7.175000 

CE1 120.582000 

HE1 7.286000
END_RES_DEF

RES_ID 783
RES_TYPE VAL
SPIN_SYSTEM_ID 69 
HETEROGENEITY 100 
N 115.780000 
HN 7.698000 
CA 62.330000 
HA 4.083000 
CB 31.500000 
HB 2.321000 
CG1 21.570000
HG1# 0.944000 
CG2 18.820000 
HG2# 0.823000 
END_RES_DEF 

RES_ID 784 
RES_TYPE SER
SPIN_SYSTEM_ID 70 
HETEROGENEITY 100 
N 111.353000 
HN 7.415000 
CA 55.719000 
HA 4.741000 
CB 66.183000 
HB1 4.200000 
HB2 3.750000 
END_RES_DEF 

RES_ID 785 
RES_TYPE LYS
SPIN_SYSTEM_ID 71 
HETEROGENEITY 100 
CA 59.030000 
HA 4.021000 
CB 31.590000
END_RES_DEF 

RES_ID 786 
RES_TYPE LYS
SPIN_SYSTEM_ID 72 
HETEROGENEITY 100 
N 120.208000 
HN 8.244000 
CA 59.720000 
HA 4.062000 
CB 30.385000 
HB1 1.779000
CG 24.530000
CD 28.182000
HD1 1.680000 
CE 41.670000 
HE1 3.137000 
HE2 3.045000 
END_RES_DEF 

RES_ID 787 
RES_TYPE LEU
SPIN_SYSTEM_ID 73 
HETEROGENEITY 100 

N 118.732000 

HN 7.422000

CA 57.922000 

HA 4.213000

CB 43.603000 

HB1 1.996000 

HB2 1.891000 

CG 27.632000 

HG 1.794000 

CD1 25.979000 

HD1# 0.924000 

CD2 23.776000 

HD2# 0.895000 
END_RES_DEF 

RES_ID 788 
RES_TYPE PHE
SPIN_SYSTEM_ID 74
HETEROGENEITY 100

N 118.732000 

HN 6.928000 

CA 60.676000 

HA 3.763000 

CB 39.750000 

HB1 2.945000 

HB2 2.381000 

CD1 133.799000 

HD1 6.400000 

CE1 131.596000 

HE1 6.928000 
END_RES_DEF 

RES_ID 789 
RES_TYPE MET
SPIN_SYSTEM_ID 75 
HETEROGENEITY 100 
N 116.272000 
HN 8.489000
CA 59.020000 
HA 3.911000 
CB 32.590000
HB1 2.318000 
HB2 2.208000 
CG 33.140000 
HG1 2.942000 
HG2 2.611000
CE 17.168000 
HE# 2.027000 
END_RES_DEF 

RES_ID 790 
RES_TYPE ALA
SPIN_SYSTEM_ID 76 
HETEROGENEITY 100 
N 119.716000 
HN 8.000000 
CA 55.170000 
HA 4.084000 
CB 18.270000 
HB# 1.485000 
END_RES_DEF

RES_ID 791
RES_TYPE ASP
SPIN_SYSTEM_ID 77 
HETEROGENEITY 100 
N 119.716000 
HN 7.376000 
CA 57.371000 
HA 4.371000 
CB 38.646000
HB1 2.730000 
END_RES_DEF 

RES_ID 792 
RES_TYPE LEU
SPIN_SYSTEM_ID 78 
HETEROGENEITY 100 
N 119.550000 
HN 7.363000 
CA 57.922000 
HA 3.398000 
CB 40.299000 
HB1 0.757000 
HB2 0.442000 
CG 27.632000 
HG 0.707000 
CD1 24.327000 
HD1# 0.184000 
CD2 25.979000 
HD2# 0.061000 
END_RES_DEF 

RES_ID 793 
RES_TYPE GLN
SPIN_SYSTEM_ID 79 
HETEROGENEITY 100 
N 114.141000
HN 8.069000 
CA 59.024000 
HA 3.804000 
CB 28.733000 
HB1 2.157000 
HB2 2.097000 
CG 35.342000 
HG1 2.460000 
NE2 111.353000 
HE21 7.319000 
HE22 7.222000 
END_RES_DEF 

RES_ID 794 
RES_TYPE ARG
SPIN_SYSTEM_ID 80
HETEROGENEITY 100 

N 118.568000 

HN 7.382000 

CA 58.473000 

HA 4.078000 

CB 29.835000 

HB1 1.973000 

HB2 1.886000 

CG 27.080000 

HG1 1.742000 

CD 43.603000 

HD1 3.390000 

HD2 3.325000 
END_RES_DEF 

RES_ID 795 
RES_TYPE VAL
SPIN_SYSTEM_ID 81
HETEROGENEITY 100
N 117.912000 
HN 7.013000 
CA 66.730000 
HA 3.039000 
CB 30.930000 
HB 1.435000 
CG1 22.124000
HG1# 0.479000 
CG2 21.573000 
HG2# 0.142000 

END_RES_DEF 

RES_ID 796 
RES_TYPE PHE
SPIN_SYSTEM_ID 82
HETEROGENEITY 100
N 116.928000 
HN 6.357000 
CA 58.470000 
HA 4.161000 
CB 38.096000 
HB1 3.090000 
HB2 2.944000 
CD1 132.147000 
HD1 6.641000 
CE1 131.596000 
HE1 6.456000 
CZ 129.393000 
HZ 6.406000 
END_RES_DEF 

RES_ID 797 
RES_TYPE THR
SPIN_SYSTEM_ID 83 
HETEROGENEITY 100 
N 115.289000 
HN 9.047000 
CA 66.734000 
HA 3.838000 
CB 68.380000 
HB 4.210000 
CG2 22.120000 
HG2# 1.296000
END_RES_DEF 

RES_ID 798 
RES_TYPE ASN
SPIN_SYSTEM_ID 84 
HETEROGENEITY 100 
N 120.700000 
HN 8.846000 
CA 55.170000 
HA 4.315000 
CB 38.090000 
HB1 2.985000 
HB2 2.661000 
END_RES_DEF 

RES_ID 799 
RES_TYPE CYS
SPIN_SYSTEM_ID 85 
HETEROGENEITY 100 

N 116.928000 

HN 6.893000 

CA 62.157000 

HA 4.405000 

CB 26.530000 

HB1 3.304000

HB2 3.032000 
END_RES_DEF 

RES_ID 800
RES_TYPE LYS
SPIN_SYSTEM_ID 86 
HETEROGENEITY 100 

N 116.764000 

HN 7.799000 

CA 58.473000 

HA 4.204000 

CB 32.588000 

HB1 1.743000 

CG 25.429000 

HG1 1.313000 

HG2 0.138000 

CD 29.835000 

HD1 1.291000 

CE 41.400000 

HE1 2.486000 

HE2 2.421000 
END_RES_DEF

RES_ID 801
RES_TYPE GLU
SPIN_SYSTEM_ID 87
HETEROGENEITY 100
N 117.912000 
HN 7.945000 
CA 57.992000 
HA 4.250000 
CB 30.385000 
HB1 2.172000 
HB2 2.003000
CG 36.994000 
HG1 2.407000 
HG2 2.203000 

END_RES_DEF 

RES_ID 802 
RES_TYPE TYR
SPIN_SYSTEM_ID 88 
HETEROGENEITY 100 
N 116.600000 
HN 7.744000 
CA 60.676000 
HA 4.369000 
CB 41.400000 
HB1 2.929000 
CD1 134.901000 
HD1 6.989000 
CE1 119.481000 
HE1 6.823000 
END_RES_DEF 

RES_ID 803 
RES_TYPE ASN
SPIN_SYSTEM_ID 89
HETEROGENEITY 100 
N 115.944000 
HN 8.241000 
CA 51.864000 
HA 5.024000
CB 40.849000 
HB1 3.069000 
HB2 2.907000 
ND2 118.732000 
HD21 8.316000 
HD22 7.809000 
END_RES_DEF 

RES_ID 804 

RES_TYPE ALA

SPIN_SYSTEM_ID 90 

HETEROGENEITY 100
END_RES_DEF 

RES_ID 805
RES_TYPE PRO
SPIN_SYSTEM_ID 91 
HETEROGENEITY 100 

CA 63.980000 

HA 2.422000 

HB1 1.949000 

HG1 1.648000 

HG2 1.558000 

CD 50.762000 

HD2 3.601000 

HD1 3.706000 
END_RES_DEF 

RES_ID 806 
RES_TYPE GLU
SPIN_SYSTEM_ID 92 
HETEROGENEITY 100 

N 112.993000 

HN 8.246000 

CA 56.820000 

HA 4.185000 

CB 28.733000 

HB1 2.095000 

HB2 1.973000 

CG 36.270000 

HG1 2.200000 
END_RES_DEF 

RES_ID 807 
RES_TYPE SER
SPIN_SYSTEM_ID 93 
HETEROGENEITY 100 

N 115.780000 

HN 8.112000 

CA 58.473000 

HA 4.406000 

CB 66.183000 

HB1 4.393000 

HB2 4.157000 
END_RES_DEF

RES_ID 808
RES_TYPE GLU
SPIN_SYSTEM_ID 94
HETEROGENEITY 100
N 123.488000 
HN 9.061000 
CA 59.574000 
HA 4.232000 
CB 29.835000 
HB1 2.169000 
CG 36.443000 
HG1 2.528000 
END_RES_DEF 

RES_ID 809 
RES_TYPE TYR
SPIN_SYSTEM_ID 95 
HETEROGENEITY 100 
N 116.436000 
HN 8.072000 
CA 60.120000 
HA 3.834000 
CB 37.550000 
HB1 3.018000 
HB2 2.738000 
CD1 132.698000 
HD1 6.891000 
CE1 120.032000 
HE1 7.011000 
END_RES_DEF 

RES_ID 810 
RES_TYPE TYR
SPIN_SYSTEM_ID 96
HETEROGENEITY 100 
N 119.880000 
HN 7.356000 
CA 61.777000 
HA 3.819000 
CB 40.300000 
HB1 3.390000
HB2 2.500000 
CD1 136.553000 
HD1 7.094000 
CE1 119.481000
HE1 7.000000 
END_RES_DEF 

RES_ID 811
RES_TYPE LYS
SPIN_SYSTEM_ID 97
HETEROGENEITY 100
N 118.076000 
HN 8.072000 
CA 60.676000 
HA 4.204000 
CB 32.588000 
HB1 2.091000 
CG 25.979000 
HG1 1.819000 
HG2 1.582000 
CD 29.834000 
HD1 1.813000 
CE 41.963000 
HE1 2.962000 
END_RES_DEF 

RES_ID 812 
RES_TYPE CYS
SPIN_SYSTEM_ID 98 
HETEROGENEITY 100 

N 116.764000 

HN 8.520000 

CA 65.087000 

HA 4.202000 

CB 27.080000 

HB1 3.396000 

HB2 3.056000 
END_RES_DEF 

RES_ID 813 
RES_TYPE ALA 
SPIN_SYSTEM_ID 99 
HETEROGENEITY 100 

N 120.700000 

HN 8.315000

CA 55.563000 

HA 3.834000 

CB 18.270000 

HB# 1.597000
END_RES_DEF 

RES_ID 814 
RES_TYPE ASN
SPIN_SYSTEM_ID 100
HETEROGENEITY 100
N 115.453000
HN 8.068000
CA 56.270000 
HA 4.329000 
CB 38.646000 
HB1 2.877000 
HB2 2.834000 
END_RES_DEF 

RES_ID 815 
RES_TYPE ILE
SPIN_SYSTEM_ID 101 
HETEROGENEITY 100 
N 119.880000 
HN 7.912000 
CA 65.080000 
HA 3.646000 
CB 39.197000 
HB 1.924000 
CG1 29.284000
HG11 1.882000 
HG12 1.201000 
CG2 17.718000 
HG2# 1.017000
CD1 13.863000
HD1# 0.940000 
END_RES_DEF 

RES_ID 816 
RES_TYPE LEU 
SPIN_SYSTEM_ID 102 
HETEROGENEITY 100
N 122.504000 
HN 8.556000 
CA 56.820000 
HA 3.670000 
CB 41.951000 
HB1 1.405000 
HB2 1.199000 
CG 26.530000 
HG 1.580000 
CD1 24.327000 
HD1# 0.701000 
CD2 25.429000 
HD2# 0.696000 
END_RES_DEF

RES_ID 817
RES_TYPE GLU
SPIN_SYSTEM_ID 103
HETEROGENEITY 100
N 120.700000
HN 8.073000
CA 60.125000
HA 3.185000
CB 29.835000
HB1 1.720000
HB2 1.310000
CG 37.545000
HG1 2.001000
HG2 1.922000
END_RES_DEF

RES_ID 818
RES_TYPE LYS
SPIN_SYSTEM_ID 104
HETEROGENEITY 100
N 117.584000
HN 7.145000
CA 59.688000
HA 4.075000
CB 32.588000
HB1 1.929000
CG 25.644000
HG1 1.492000
CD 29.284000
HD1 1.681000
CE 41.963000
HE1 2.964000
END_RES_DEF

RES_ID 819
RES_TYPE PHE
SPIN_SYSTEM_ID 105
HETEROGENEITY 100
N 121.028000
HN 7.869000
CA 61.230000
HA 4.328000
CB 39.200000
HB1 3.133000
HB2 3.047000
CD1 133.800000
HD1 7.180000
END_RES_DEF

RES_ID 820
RES_TYPE PHE
SPIN_SYSTEM_ID 106
HETEROGENEITY 100
N 120.700000
HN 9.126000
CA 60.691000
HA 3.961000
CB 38.640000
HB1 3.289000
HB2 3.067000
CD1 133.248000
HD1 6.904000
CE1 132.698000
HE1 7.011000
END_RES_DEF

RES_ID 821
RES_TYPE PHE
SPIN_SYSTEM_ID 107
HETEROGENEITY 100
N 118.076000
HN 8.359000
CA 61.770000
HA 3.840000
CB 38.090000
HB1 3.064000
CD1 133.248000
HD1 7.175000
CE1 132.698000
HE1 7.294000
CZ 131.596000
HZ 7.430000
END_RES_DEF

RES_ID 822
RES_TYPE SER
SPIN_SYSTEM_ID 108
HETEROGENEITY 100
N 114.961000
HN 7.906000
CA 61.773000
HA 4.200000
CB 62.879000
HB1 4.007000
END_RES_DEF

RES_ID 823
RES_TYPE LYS
HETEROGENEITY 100
N 120.864000
HN 7.938000
CA 56.820000
HA 4.008000
CB 31.487000
HB1 1.730000
HB2 1.567000
CG 23.226000
HG1 0.833000
CD 27.080000
HD1 1.403000
CE 42.501000
HE1 2.569000
HE2 2.422000
END_RES_DEF



RES_ID 824 
RES_TYPE ILE 
SPIN_SYSTEM_ID 110
HETEROGENEITY 100 
N 116.928000 
HN 8.101000 
CA 64.530000 
HA 3.818000 
CB 36.990000 
HB 1.746000 
CG1 26.530000
HG11 1.140000 
HG12 1.073000 
CG2 18.820000 
HG2# 0.654000 
CD1 13.312000 
HD1# 0.541000 
END_RES_DEF 

RES_ID 825 
RES_TYPE LYS
SPIN_SYSTEM_ID 111 
HETEROGENEITY 100 
N 122.176000 
HN 7.546000 
CA 59.024000 
HA 4.043000 
CB 32.360000 



HB1 1.879000 
HB2 1.757000 
CG 24.878000 
HG1 1.390000
HG2 1.302000 
CD 29.284000 
HD1 1.633000 
CE 41.400000 
HE1 2.913000 
END_RES_DEF 

RES_ID 826 
RES_TYPE GLU
SPIN_SYSTEM_ID 112 
HETEROGENEITY 100 
N 121.192000 
HN 8.063000 
CA 59.024000 
HA 3.995000 
CB 29.834000 
HB1 2.058000 
CG 36.050000 
HG1 2.342000
HG2 2.205000 
END_RES_DEF 

RES_ID 827 
RES_TYPE ALA
SPIN_SYSTEM_ID 113 
HETEROGENEITY 100 
N 117.748000 
HN 7.620000 
CA 52.410000 
HA 4.291000 
CB 19.920000 
HB# 1.358000
END_RES_DEF 

RES_ID 828 
RES_TYPE GLY
SPIN_SYSTEM_ID 114 
HETEROGENEITY 100 
N 126.767000 
HN 7.744000 
CA 45.902000 
HA1 4.019000 
HA2 3.935000 
END_RES_DEF 

RES_ID 829 
RES_TYPE LEU
SPIN_SYSTEM_ID 115 
HETEROGENEITY 100 
N 117.912000 
HN 7.742000 
CA 55.719000 
HA 4.215000 
CB 43.052000 
HB1 1.562000 
CG 27.632000 
HG 1.536000 
CD1 23.776000 
HD1# 0.711000 
END_RES_DEF 

RES_ID 830 
RES_TYPE ILE
SPIN_SYSTEM_ID 116
HETEROGENEITY 100
N 115.453000
HN 7.458000 
CA 60.676000 
HA 4.232000 
CB 39.748000 
HB 1.810000 
CG1 27.080000
HG11 1.314000 
HG12 0.918000 
CG2 17.718000 
HG2# 0.815000 
CD1 13.312000 
HD1# 0.794000 
END_RES_DEF 

RES_ID 831 
RES_TYPE ASP
SPIN_SYSTEM_ID 117 
HETEROGENEITY 100 

N 123.488000 

HN 8.270000 

CA 54.620000 

HA 4.571000 

CB 41.400000 

HB1 2.693000 

HB2 2.540000 



END_RES_DEF 

RES_ID 832 
RES_TYPE LYS
SPIN_SYSTEM_ID 118 
HETEROGENEITY 100 
N 125.450000 
HN 7.774000 
CA 57.720000 
HA 4.082000 
CB 33.410000 
END_RES_DEF

Figure 2A-2D
Three-Dimensional Structure of the P/CAF Bromodomain

Figure 2E
Figure 2F-2G
Figure 2H

<130> 2459-1-003

<140> UNASSIGNED
<141> 2000-02-22 

<160> 44 

<170> PatentIn Ver. 2.0

<210> 1 

<211> 3014 

<212> DNA 

<213> Homo sapiens 

<400> 1 

ggggccgcgt cgacgcggaa aagaggccgt ggggggcctc ccagcgctgg cagacaccgt 60 
gaggctggca gccgccggca cgcacaccta gtccgcagtc ccgaggaaca tgtccgcagc 120
cagggcgcgg agcagagtcc cgggcaggag aaccaaggga gggcgtgtgc tgtggcggcg 180
gcggcagcgg cagcggagcc gctagtcccc tccctcctgg gggagcagct gccgccgctg 240
ccgccgccgc caccaccatc agcgcgcggg gcccggccag agcgagccgg gcgagcggcg 300
cgctaggggg agggcggggg cggggagggg ggtgggcgaa gggggcggga gggcgtgggg 360
ggagggtctc gctctcccga ctaccagagc ccgagggaga ccctggcggc ggcggcggcg 420
cctgacactc ggcgcctcct gccgtgctcc ggggcggcat gtccgaggct ggcggggccg 480 
ggccgggcgg ctgcggggca ggagccgggg caggggccgg gcccggggcg ctgcccccgc 540 
agcctgcggc gcttccgccc gcgcccccgc agggctcccc ctgcgccgct gccgccgggg 600 
gctcgggcgc ctgcggtccg gcgacggcag tggctgcagc gggcacggcc gaaggaccgg 660 
gaggcggtgg ctcggcccga atcgccgtga agaaagcgca actacgctcc gctccgcggg 720
ccaagaaact ggagaaactc ggagtgtact ccgcctgcaa ggccgaggag tcttgtaaat 780 
gtaatggctg gaaaaaccct aacccctcac ccactccccc cagagccgac ctgcagcaaa 840 
taattgtcag tctaacagaa tcctgtcgga gttgtagcca tgccctagct gctcatgttt 900 
cccacctgga gaatgtgtca gaggaagaaa tgaacagact cctgggaata gtattggatg 960 
tggaatatct ctttacctgt gtccacaagg aagaagatgc agataccaaa caagtttatt 1020
tctatctatt taagctcttg agaaagtcta ttttacaaag aggaaaacct gtggttgaag 1080
gctctttgga aaagaaaccc ccatttgaaa aacctagcat tgaacagggt gtgaataact 1140 
ttgtgcagta caaatttagt cacctgccag caaaagaaag gcaaacaata gttgagttgg 1200 
caaaaatgtt cctaaaccgc atcaactatt ggcatctgga ggcaccatct caacgaagac 1260 
tgcgatctcc caatgatgat atttctggat acaaagagaa ctacacaagg tggctgtgtt 1320
actgcaacgt gccacagttc tgcgacagtc tacctcggta cgaaaccaca caggtgtttg 1380
ggagaacatt gcttcgctcg gtcttcactg ttatgaggcg acaactcctg gaacaagcaa 1440 
gacaggaaaa agataaactg cctcttgaaa aacgaactct aatcctcact catttcccaa 1500 
aatttctgtc catgctagaa gaagaagtat atagtcaaaa ctctcccatc tgggatcagg 1560
attttctctc agcctcttcc agaaccagcc agctaggcat ccaaacagtt atcaatccac 1620
ctcctgtggc tgggacaatt tcatacaatt caacctcatc ttcccttgag cagccaaacg 1680 
cagggagcag cagtcctgcc tgcaaagcct cttctggact tgaggcaaac ccaggagaaa 1740 
agaggaaaat gactgattct catgttctgg aggaggccaa gaaaccccga gttatggggg 1800 
atattccgat ggaattaatc aacgaggtta tgtctaccat cacggaccct gcagcaatgc 1860
ttggaccaga gaccaatttt ctgtcagcac actcggccag ggatgaggcg gcaaggttgg 1920
aagagcgcag gggtgtaatt gaatttcacg tggttggcaa ttccctcaac cagaaaccaa 1980 
acaagaagat cctgatgtgg ctggttggcc tacagaacgt tttctcccac cagctgcccc 2040
gaatgccaaa agaatacatc acacggctcg tctttgaccc gaaacacaaa acccttgctt 2100 
taattaaaga tggccgtgtt attggtggta tctgtttccg tatgttccca tctcaaggat 2160 
tcacagagat tgtcttctgt gctgtaacct caaatgagca agtcaagggc tatggaacac 2220
acctgatgaa tcatttgaaa gaatatcaca taaagcatga catcctgaac ttcctcacat 2280
atgcagatga atatgcaatt ggatacttta agaaacaggg tttctccaaa gaaattaaaa 2340 
tacctaaaac caaatatgtt ggctatatca aggattatga aggagccact ttaatgggat 2400
gtgagctaaa tccacggatc ccgtacacag aattttctgt catcattaaa aagcagaagg 2460 
agataattaa aaaactgatt gaaagaaaac aggcacaaat tcgaaaagtt taccctggac 2520
tttcatgttt taaagatgga gttcgacaga ttcctataga aagcattcct ggaattagag 2580 
agacaggctg gaaaccgagt ggaaaagaga aaagtaaaga gcccagagac cctgaccagc 2640
tttacagcac gctcaagagc atcctccagc aggtgaagag ccatcaaagc gcttggccct 2700
tcatggaacc tgtgaagaga acagaagctc caggatatta tgaagttata aggttcccca 2760
tggatctgaa aaccatgagt gaacgcctca agaataggta ctacgtgtct aagaaattat 2820
tcatggcaga cttacagcga gtctttacca attgcaaaga gtacaacgcc gctgagagtg 2880
aatactacaa atgtgccaat atcctggaga aattcttctt cagtaaaatt aaggaagctg 2940
gattaattga caagtgattt tttttccccc tctgcttctt agaaactcac caagcagtgt 3000 
gcctaaagca aggt 3014

<210> 2 
<211> 832 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala 
1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80 

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145 150 155 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
165 170 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
180 185 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
195 200 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
210 215 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225 230 235 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
245 250 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
260 265 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
275 280 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
290 295 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305 310 315 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
325 330 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met
340 345 350



Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp
355 360 365

Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr Val
370 375 380

Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr Ser
385 390 395 400

Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys Lys
405 410 415

Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Thr
420 425 430

Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly Asp
435 440 445

Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro
450 455 460

Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala
465 470 475 480

Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe
485 490 495

His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu
500 505 510

Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg
515 520 525

Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys
530 535 540

Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe
545 550 555 560

Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val
565 570 575

Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His
580 585 590

Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr Tyr
595 600 605

Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys
610 615 620

Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr
625 630 635 640

Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro Tyr
645 650 655

Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys
660 665 670

Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu
675 680 685

Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro
690 695 700

Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys
705 710 715 720

Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile Leu
725 730 735

Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro Val
740 745 750

Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Phe Pro Met
755 760 765

Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val Ser
770 775 780

Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys
785 790 795 800

Glu Tyr Asn Ala Ala Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile Leu
805 810 815

Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys
820 825 830

<210> 3
<211> 12 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: peptide 
<220> 

<221> VARIANT 
<222> (2) 

<223> It represents 2 to 3 undesignated amino acids. 
They can be any amino acids.

<220> 

<221> VARIANT 
<222> (4) 

<223> It represents 5 to 8 undesignated amino acids. 
They can be any amino acids.

<220> 

<221> VARIANT 
<222> (6) 

<223> It represents one undesignated amino acid. It can
be any amino acid. 

<220> 

<221> VARIANT 
<222> (9) 

<223> It represents 5 undesignated amino acids. They can 
be any amino acids.

<220> 

<221> VARIANT 
<222> (5) 

<223> It can be any amino acid from the group of: P, K, 
or H. 

<220> 

<221> VARIANT 
<222> (8) 

<223> It can be any amino acid from the group of: Y, F, 
or H. 

<220> 

<221> VARIANT 
<222> (11) 

<223> It can be any amino acid from the group of: M, I,
or V.



<400> 3 

Phe Xaa Pro Xaa Xaa Xaa Tyr Xaa Xaa Pro Xaa Asp 
1 5 10
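
The feature table for SEQ ID NO:3 describes a motif with variable-length gaps, which can be easier to read as a pattern. The following is a minimal sketch, assuming one-letter amino acid codes and treating each "undesignated" position as any of the 20 standard residues; the names `MOTIF`, `find_motif` and `PCAF_FRAGMENT` are illustrative and are not part of the filed listing.

```python
import re

# Any of the 20 standard residues in one-letter code (an assumption made
# for this sketch; the listing only says "any amino acids").
X = "[ACDEFGHIKLMNPQRSTVWY]"

# Regular-expression reading of the SEQ ID NO:3 feature table.
MOTIF = re.compile(
    "F"              # position 1: Phe
    + X + "{2,3}"    # position 2: 2 to 3 undesignated residues
    + "P"            # position 3: Pro
    + X + "{5,8}"    # position 4: 5 to 8 undesignated residues
    + "[PKH]"        # position 5: P, K, or H
    + X              # position 6: one undesignated residue
    + "Y"            # position 7: Tyr
    + "[YFH]"        # position 8: Y, F, or H
    + X + "{5}"      # position 9: 5 undesignated residues
    + "P"            # position 10: Pro
    + "[MIV]"        # position 11: M, I, or V
    + "D"            # position 12: Asp
)


def find_motif(sequence):
    """Return (start, end, matched_text) for the first motif hit, or None."""
    hit = MOTIF.search(sequence.upper())
    if hit is None:
        return None
    return (hit.start(), hit.end(), hit.group())


# Phe30 through Asp51 of SEQ ID NO:7, transcribed to one-letter code.
PCAF_FRAGMENT = "FMEPVKRTEAPGYYEVIRSPMD"
```

Calling `find_motif(PCAF_FRAGMENT)` returns a hit spanning the whole fragment, with positions 2, 4 and 9 filled by 2, 6 and 5 residues respectively.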



<210> 4 
<211> 12 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: peptide 
<220> 

<221> SITE 
<222> (6) 

<223> It is acetyl-lysine.
<400> 4

Ile Ser Tyr Gly Arg Xaa Lys Arg Arg Gln Arg Arg
1 5 10



<210> 5 
<211> 14 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: peptide 
<220> 

<221> SITE 
<222> (8) 

<223> It is acetyl-lysine. 
<400> 5 

Ala Arg Lys Ser Thr Gly Gly Xaa Ala Pro Arg Lys Gln Leu
1 5 10



<210> 6 
<211> 14 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: peptide 
<220> 

<221> SITE 
<222> (8) 

<223> It is acetyl-lysine. 
<400> 6 

Gln Ser Thr Ser Arg His Lys Xaa Leu Met Phe Lys Thr Glu
1 5 10



<210> 7 
<211> 110 
<212> PRT 

<213> Homo sapiens 
<400> 7 

Ser Lys Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser
1 5 10 15

Ile Leu Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu
20 25 30

Pro Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser
35 40 45

Pro Met Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr
50 55 60

Val Ser Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn
65 70 75 80

Cys Lys Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn
85 90 95

Ile Leu Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly
100 105 110



<210> 8 
<211> 110 
<212> PRT 

<213> Homo sapiens 
<400> 8 

Gly Lys Glu Leu Lys Asp Pro Asp Gln Leu Tyr Thr Thr Leu Lys Asn
1 5 10 15

Leu Leu Ala Gln Ile Lys Ser His Pro Ser Ala Trp Pro Phe Met Glu
20 25 30

Pro Val Lys Lys Ser Glu Ala Pro Asp Tyr Tyr Glu Val Ile Arg Phe
35 40 45

Pro Ile Asp Leu Lys Thr Met Thr Glu Arg Leu Arg Ser Arg Tyr Tyr
50 55 60

Val Thr Arg Lys Leu Phe Val Ala Asp Leu Gln Arg Val Ile Ala Asn
65 70 75 80

Cys Arg Glu Tyr Asn Pro Pro Asp Ser Glu Tyr Cys Arg Cys Ala Ser
85 90 95

Ala Leu Glu Lys Phe Phe Tyr Phe Lys Leu Lys Glu Gly Gly
100 105 110



<210> 9 
<211> 109 
<212> PRT 

<213> Tetrahymena thermophila 
<400> 9 

Leu Lys Lys Ser Lys Glu Arg Ser Phe Asn Leu Gln Cys Ala Asn Val
1 5 10 15

Ile Glu Asn Met Lys Arg His Lys Gln Ser Trp Pro Phe Leu Asp Pro
20 25 30

Val Asn Lys Asp Asp Val Pro Asp Tyr Tyr Asp Val Ile Thr Asp Pro
35 40 45

Ile Asp Ile Lys Ala Ile Glu Lys Lys Leu Gln Asn Asn Gln Tyr Val
50 55 60

Asp Lys Asp Gln Phe Ile Lys Asp Val Lys Arg Ile Phe Thr Asn Ala
65 70 75 80

Lys Ile Tyr Asn Gln Pro Asp Thr Ile Tyr Tyr Lys Ala Ala Lys Glu
85 90 95

Leu Glu Asp Phe Val Glu Pro Tyr Leu Thr Lys Leu Lys
100 105

<210> 10
<211> 109 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 10 

Ala Gln Arg Pro Lys Arg Gly Pro His Asp Ala Ala Ile Gln Asn Ile
1 5 10 15

Leu Thr Glu Leu Gln Asn His Ala Ala Ala Trp Pro Phe Leu Gln Pro
20 25 30

Val Asn Lys Glu Glu Val Pro Asp Tyr Tyr Asp Phe Ile Lys Glu Pro
35 40 45

Met Asp Leu Ser Thr Met Glu Ile Lys Leu Glu Ser Asn Lys Tyr Gln
50 55 60

Lys Met Glu Asp Phe Ile Tyr Asp Ala Arg Leu Val Phe Asn Asn Cys
65 70 75 80

Arg Met Tyr Asn Gly Glu Asn Thr Ser Tyr Tyr Lys Tyr Ala Asn Arg
85 90 95

Leu Glu Lys Phe Phe Asn Asn Lys Val Lys Glu Ile Pro
100 105



<210> 11 
<211> 112 
<212> PRT 

<213> Homo sapiens 
<400> 11 

Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln Ala Leu Met Pro Thr
1 5 10 15

Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln
20 25 30

Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val
35 40 45

Lys Ser Pro Met Asp Leu Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly
50 55 60

Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp Ile Trp Leu Met Phe
65 70 75 80

Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser Arg Val Tyr Lys Tyr
85 90 95

Cys Ser Lys Leu Ser Glu Val Phe Glu Gln Glu Ile Asp Pro Val Met
100 105 110



<210> 12 
<211> 112 
<212> PRT 

<213> Homo sapiens 
<400> 12 

Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln Ala Leu Met Pro Thr
1 5 10 15

Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln
20 25 30

Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val
35 40 45

Lys Asn Pro Met Asp Leu Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly
50 55 60

Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp Val Trp Leu Met Phe
65 70 75 80

Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe
85 90 95

Cys Ser Lys Leu Ala Glu Val Phe Glu Gln Glu Ile Asp Pro Val Met
100 105 110



<210> 13 
<211> 112 
<212> PRT 

<213> Mus musculus 
<400> 13

Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln Ala Leu Met Pro Thr
1 5 10 15

Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln
20 25 30

Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val
35 40 45

Lys Asn Pro Met Asp Leu Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly
50 55 60

Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp Val Arg Leu Met Phe
65 70 75 80

Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe
85 90 95

Cys Ser Lys Leu Ala Glu Val Phe Glu Gln Glu Ile Asp Pro Val Met
100 105 110



<210> 14 
<211> 111 
<212> PRT 

<213> Caenorhabditis elegans
<400> 14

Asp Thr Val Phe Ser Gln Glu Asp Leu Ile Lys Phe Leu Leu Pro Val
1 5 10 15

Trp Glu Lys Leu Asp Lys Ser Glu Asp Ala Ala Pro Phe Arg Val Pro
20 25 30

Val Asp Ala Lys Leu Leu Asn Ile Pro Asp Tyr His Glu Ile Ile Lys
35 40 45

Arg Pro Met Asp Leu Glu Thr Val His Lys Lys Leu Tyr Ala Gly Gln
50 55 60

Tyr Gln Asn Ala Gly Gln Phe Cys Asp Asp Ile Trp Leu Met Leu Asp
65 70 75 80

Asn Ala Trp Leu Tyr Asn Arg Lys Asn Ser Lys Val Tyr Lys Tyr Gly
85 90 95

Leu Lys Leu Ser Glu Met Phe Val Ser Glu Met Asp Pro Val Met
100 105 110



<210> 15 
<211> 110 
<212> PRT 

<213> Homo sapiens 
<400> 15 

Arg Arg Arg Thr Asp Pro Met Val Thr Leu Ser Ser Ile Leu Glu Ser
1 5 10 15

Ile Ile Asn Asp Met Arg Asp Leu Pro Asn Thr Tyr Pro Phe His Thr
20 25 30

Pro Val Asn Ala Lys Val Val Lys Asp Tyr Tyr Lys Ile Ile Thr Arg
35 40 45

Pro Met Asp Leu Gln Thr Leu Arg Glu Asn Val Arg Lys Arg Leu Tyr
50 55 60

Pro Ser Arg Glu Glu Phe Arg Glu His Leu Glu Leu Ile Val Lys Asn
65 70 75 80

Ser Ala Thr Tyr Asn Gly Pro Lys His Ser Leu Thr Gln Ile Ser Gln
85 90 95

Ser Met Leu Asp Leu Cys Asp Glu Lys Leu Lys Glu Lys Glu
100 105 110



<210> 16 
<211> 110 
<212> PRT 

<213> Mesocricetus auratus 
<400> 16 

Arg Arg Arg Thr Asp Pro Met Val Thr Leu Ser Ser Ile Leu Glu Ser
1 5 10 15

Ile Ile Asn Asp Met Arg Asp Leu Pro Asn Thr Tyr Pro Phe His Thr
20 25 30

Pro Val Asn Ala Lys Val Val Lys Asp Tyr Tyr Lys Ile Ile Thr Arg
35 40 45

Pro Met Asp Leu Gln Thr Leu Arg Glu Asn Val Arg Lys Arg Leu Tyr
50 55 60

Pro Ser Arg Glu Glu Phe Arg Glu His Leu Glu Leu Ile Val Lys Asn
65 70 75 80

Ser Ala Thr Tyr Asn Gly Pro Lys His Ser Leu Thr Gln Ile Ser Gln
85 90 95

Ser Met Leu Asp Leu Cys Asp Glu Lys Leu Lys Glu Lys Glu
100 105 110



<210> 17 
<211> 111 
<212> PRT 

<213> Homo sapiens 
<400> 17 

Leu Leu Asp Asp Asp Asp Gln Val Ala Phe Ser Phe Ile Leu Asp Asn
1 5 10 15

Ile Val Thr Gln Lys Met Met Ala Val Pro Asp Ser Trp Pro Phe His
20 25 30

His Pro Val Asn Lys Lys Phe Val Pro Asp Tyr Tyr Lys Val Ile Val
35 40 45

Asn Pro Met Asp Leu Glu Thr Ile Arg Lys Asn Ile Ser Lys His Lys
50 55 60

Tyr Gln Ser Arg Glu Ser Phe Leu Asp Asp Val Asn Leu Ile Leu Ala
65 70 75 80

Asn Ser Val Lys Tyr Asn Gly Pro Glu Ser Gln Tyr Thr Lys Thr Ala
85 90 95

Gln Glu Ile Val Asn Val Cys Tyr Gln Thr Leu Thr Glu Tyr Asp
100 105 110



<210> 18 
<211> 111 
<212> PRT 

<213> Mesocricetus auratus 
<400> 18 

Leu Leu Asp Asp Asp Asp Gln Val Ala Phe Ser Phe Ile Leu Asp Asn
1 5 10 15

Ile Val Thr Gln Lys Met Met Ala Val Pro Asp Ser Trp Pro Phe His
20 25 30

His Pro Val Asn Lys Lys Phe Val Pro Asp Tyr Tyr Lys Val Ile Val
35 40 45

Ser Pro Met Asp Leu Glu Thr Ile Arg Lys Asn Ile Ser Lys His Lys
50 55 60

Tyr Gln Ser Arg Glu Ser Phe Leu Asp Asp Val Asn Leu Ile Leu Ala
65 70 75 80

Asn Ser Val Lys Tyr Asn Gly Ser Glu Ser Gln Tyr Thr Lys Thr Ala
85 90 95

Gln Glu Ile Val Asn Val Cys Tyr Gln Thr Leu Thr Glu Tyr Asp
100 105 110



<210> 19 
<211> 111 
<212> PRT 

<213> Homo sapiens 
<400> 19 

Lys Pro Gly Arg Val Thr Asn Gln Leu Gln Tyr Leu His Lys Val Val
1 5 10 15

Met Lys Ala Leu Trp Lys His Gln Phe Ala Trp Pro Phe Arg Gln Pro
20 25 30

Val Asp Ala Val Lys Leu Gly Leu Pro Asp Tyr His Lys Ile Ile Lys
35 40 45

Gln Pro Met Asp Met Gly Thr Ile Lys Arg Arg Leu Glu Asn Asn Tyr
50 55 60

Tyr Trp Ala Ala Ser Glu Cys Met Gln Asp Phe Asn Thr Met Phe Thr
65 70 75 80

Asn Cys Tyr Ile Tyr Asn Lys Pro Thr Asp Asp Ile Val Leu Met Ala
85 90 95

Gln Thr Leu Glu Lys Ile Phe Leu Gln Lys Val Ala Ser Met Pro
100 105 110



<210> 20
<211> 111 
<212> PRT 

<213> Homo sapiens 

<400> 20 

Lys Pro Gly Arg Lys Thr Asn Gln Leu Gln Tyr Met Gln Asn Val Val
1 5 10 15

Val Lys Thr Leu Trp Lys His Gln Phe Ala Trp Pro Phe Tyr Gln Pro
20 25 30

Val Asp Ala Ile Lys Leu Asn Leu Pro Asp Tyr His Lys Ile Ile Lys
35 40 45

Asn Pro Met Asp Met Gly Thr Ile Lys Lys Arg Leu Glu Asn Asn Tyr
50 55 60

Tyr Trp Ser Ala Ser Glu Cys Met Gln Asp Phe Asn Thr Met Phe Thr
65 70 75 80

Asn Cys Tyr Ile Tyr Asn Lys Pro Thr Asp Asp Ile Val Leu Met Ala
85 90 95

Gln Ala Leu Glu Lys Ile Phe Leu Gln Lys Val Ala Gln Met Pro
100 105 110



<210> 21 
<211> 111 
<212> PRT 

<213> Drosophila melanogaster 
<400> 21 

Arg Pro Gly Arg Asn Thr Asn Gln Leu Gln Tyr Leu Ile Lys Thr Val
1 5 10 15

Met Lys Val Ile Trp Lys His His Phe Ser Trp Pro Phe Gln Gln Pro
20 25 30

Val Asp Ala Lys Lys Leu Asn Leu Pro Asp Tyr His Lys Ile Ile Lys
35 40 45

Gln Pro Met Asp Met Gly Thr Ile Lys Lys Arg Leu Glu Asn Asn Tyr
50 55 60

Tyr Trp Ser Ala Lys Glu Thr Ile Gln Asp Phe Asn Thr Met Phe Asn
65 70 75 80

Asn Cys Tyr Val Tyr Asn Lys Pro Gly Glu Asp Val Val Val Met Ala
85 90 95

Gln Thr Leu Glu Lys Val Phe Leu Gln Lys Ile Glu Ser Met Pro
100 105 110



<210> 22 
<211> 109 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 22 

Asn Pro Ile Pro Lys His Gln Gln Lys His Ala Leu Leu Ala Ile Lys
1 5 10 15

Ala Val Lys Arg Leu Lys Asp Ala Arg Pro Phe Leu Gln Pro Val Asp
20 25 30

Pro Val Lys Leu Asp Ile Pro Phe Tyr Phe Asn Tyr Ile Lys Arg Pro
35 40 45

Met Asp Leu Ser Thr Ile Glu Arg Lys Leu Asn Val Gly Ala Tyr Glu
50 55 60

Val Pro Glu Gln Ile Thr Glu Asp Phe Asn Leu Met Val Asn Asn Ser
65 70 75 80

Ile Lys Phe Asn Gly Pro Asn Ala Gly Ile Ser Gln Met Ala Arg Asn
85 90 95

Ile Gln Ala Ser Phe Glu Lys His Met Leu Asn Met Pro
100 105



<210> 23 
<211> 113 
<212> PRT 

<213> Homo sapiens 
<400> 23 

Lys Lys Gly Lys Leu Ser Glu Gln Leu Lys His Cys Asn Gly Ile Leu
1 5 10 15

Lys Glu Leu Leu Ser Lys Lys His Ala Ala Tyr Ala Trp Pro Phe Tyr
20 25 30

Lys Pro Val Asp Ala Ser Ala Leu Gly Leu His Asp Tyr His Asp Ile
35 40 45

Ile Lys His Pro Met Asp Leu Ser Thr Val Lys Arg Lys Met Glu Asn
50 55 60

Arg Asp Tyr Arg Asp Ala Gln Glu Phe Ala Ala Asp Val Arg Leu Met
65 70 75 80

Phe Ser Asn Cys Tyr Lys Tyr Asn Pro Pro Asp His Asp Val Val Ala
85 90 95

Met Ala Arg Lys Leu Gln Asp Val Phe Glu Phe Arg Tyr Ala Lys Met
100 105 110

Pro



<210> 24 
<211> 113 
<212> PRT 

<213> Homo sapiens 
<400> 24 

Lys Lys Gly Lys Leu Ser Glu His Leu Arg Tyr Cys Asp Ser Ile Leu
1 5 10 15

Arg Glu Met Leu Ser Lys Lys His Ala Ala Tyr Ala Trp Pro Phe Tyr
20 25 30

Lys Pro Val Asp Ala Glu Ala Leu Glu Leu His Asp Tyr His Asp Ile
35 40 45

Ile Lys His Pro Met Asp Leu Ser Thr Val Lys Arg Lys Met Asp Gly
50 55 60

Arg Glu Tyr Pro Asp Ala Gln Gly Phe Ala Ala Asp Val Arg Leu Met
65 70 75 80

Phe Ser Asn Cys Tyr Lys Tyr Asn Pro Pro Asp His Glu Val Val Ala
85 90 95

Met Ala Arg Lys Leu Gln Asp Val Phe Glu Met Arg Phe Ala Lys Met
100 105 110

Pro



<210> 25 
<211> 113 
<212> PRT 

<213> Drosophila melanogaster 
<400> 25 

Asn Lys Glu Lys Leu Ser Asp Ala Leu Lys Ser Cys Asn Glu Ile Leu
1 5 10 15

Lys Glu Leu Phe Ser Lys Lys His Ser Gly Tyr Ala Trp Pro Phe Tyr
20 25 30

Lys Pro Val Asp Ala Glu Met Leu Gly Leu His Asp Tyr His Asp Ile
35 40 45

Ile Lys Lys Pro Met Asp Leu Gly Thr Val Lys Arg Lys Met Asp Asn
50 55 60

Arg Glu Tyr Lys Ser Ala Pro Glu Phe Ala Ala Asp Val Arg Leu Ile
65 70 75 80

Phe Thr Asn Cys Tyr Lys Tyr Asn Pro Pro Asp His Asp Val Val Ala
85 90 95

Met Gly Arg Lys Leu Gln Asp Val Phe Glu Met Arg Tyr Ala Asn Ile
100 105 110

Pro



<210> 26 
<211> 113 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 26 

Lys Ser Lys Arg Leu Gln Gln Ala Met Lys Phe Cys Gln Ser Val Leu
1 5 10 15

Lys Glu Leu Met Ala Lys Lys His Ala Ser Tyr Asn Tyr Pro Phe Leu
20 25 30

Glu Pro Val Asp Pro Val Ser Met Asn Leu Pro Thr Tyr Phe Asp Tyr
35 40 45

Val Lys Glu Pro Met Asp Leu Gly Thr Ile Ala Lys Lys Leu Asn Asp
50 55 60

Trp Gln Tyr Gln Thr Met Glu Asp Phe Glu Arg Glu Val Arg Leu Val
65 70 75 80

Phe Lys Asn Cys Tyr Thr Phe Asn Pro Asp Gly Thr Ile Val Asn Met
85 90 95

Met Gly His Arg Leu Glu Glu Val Phe Asn Ser Lys Trp Ala Asp Arg
100 105 110

Pro



<210> 27 
<211> 108 
<212> PRT 

<213> Homo sapiens 
<400> 27 

Met Glu Met Gln Leu Thr Pro Phe Leu Ile Leu Leu Arg Lys Thr Leu
1 5 10 15

Glu Gln Leu Gln Glu Lys Asp Thr Gly Asn Ile Phe Ser Glu Pro Val
20 25 30

Pro Leu Ser Glu Val Pro Asp Tyr Leu Asp His Ile Lys Lys Pro Met
35 40 45

Asp Phe Phe Thr Met Lys Gln Asn Leu Glu Ala Tyr Arg Tyr Leu Asn
50 55 60

Phe Asp Asp Phe Glu Glu Asp Phe Asn Leu Ile Val Ser Asn Cys Leu
65 70 75 80

Lys Tyr Asn Ala Lys Asp Thr Ile Phe Tyr Arg Ala Ala Val Arg Leu
85 90 95

Arg Glu Gln Gly Gly Ala Val Val Arg Gln Ala Arg
100 105



<210> 28 
<211> 113 






<212> PRT 

<213> Homo sapiens 



<400> 28 

Ser Glu Asp Gln Glu Ala Ile Gln Ala Gln Lys Ile Trp Lys Lys Ala
1 5 10 15

Ile Met Leu Val Trp Arg Ala Ala Ala Asn His Arg Tyr Ala Asn Val
20 25 30

Phe Leu Gln Pro Val Thr Asp Asp Ile Ala Pro Gly Tyr His Ser Ile
35 40 45

Val Gln Arg Pro Met Asp Leu Ser Thr Ile Lys Lys Asn Ile Glu Asn
50 55 60

Gly Leu Ile Arg Ser Thr Ala Glu Phe Gln Arg Asp Ile Met Leu Met
65 70 75 80

Phe Gln Asn Ala Val Met Tyr Asn Ser Ser Asp His Asp Val Tyr His
85 90 95

Met Ala Val Glu Met Gln Arg Asp Val Leu Glu Gln Ile Gln Gln Phe
100 105 110

Leu



<210> 29 
<211> 106 
<212> PRT 

<213> Gallus gallus 
<400> 29 

Asn Leu Pro Thr Val Asp Pro Ile Ala Val Cys His Glu Leu Tyr Asn
1 5 10 15

Thr Ile Arg Asp Tyr Lys Asp Glu Gln Gly Arg Leu Leu Cys Glu Leu
20 25 30

Phe Ile Arg Ala Pro Lys Arg Arg Asn Gln Pro Asp Tyr Tyr Glu Val
35 40 45

Val Ser Gln Pro Ile Asp Leu Met Lys Ile Gln Gln Lys Leu Lys Met
50 55 60

Glu Glu Tyr Asp Asp Val Asn Val Leu Thr Ala Asp Phe Gln Leu Leu
65 70 75 80

Phe Asn Asn Ala Lys Ala Tyr Tyr Lys Pro Asp Ser Pro Glu Tyr Lys
85 90 95

Ala Ala Cys Lys Leu Trp Glu Leu Tyr Leu
100 105



<210> 30 
<211> 112 
<212> PRT 

<213> Gallus gallus 
<400> 30 

Ser Ser Pro Gly Tyr Leu Lys Glu Ile Leu Glu Gln Leu Leu Glu Ala
1 5 10 15

Val Ala Val Ala Thr Asn Pro Ser Gly Arg Leu Ile Ser Glu Leu Phe
20 25 30

Gln Lys Leu Pro Ser Lys Val Gln Tyr Pro Asp Tyr Tyr Ala Ile Ile
35 40 45

Lys Glu Pro Ile Asp Leu Lys Thr Ile Ala Gln Arg Ile Gln Asn Gly
50 55 60

Thr Tyr Lys Ser Ile His Ala Met Ala Lys Asp Ile Asp Leu Leu Ala
65 70 75 80

Lys Asn Ala Lys Thr Tyr Asn Glu Pro Gly Ser Gln Val Phe Lys Asp
85 90 95

Ala Asn Ala Ile Lys Lys Ile Phe Asn Met Lys Lys Ala Glu Ile Glu
100 105 110



<210> 31 
<211> 112 
<212> PRT 

<213> Gallus gallus 
<400> 31 

Thr Ser Phe Met Asp Thr Ser Asn Pro Leu Tyr Gln Leu Tyr Asp Thr
1 5 10 15

Val Arg Ser Cys Arg Asn Asn Gln Gly Gln Leu Ile Ser Glu Pro Phe
20 25 30

Phe Gln Leu Pro Ser Lys Lys Lys Tyr Pro Asp Tyr Tyr Gln Gln Ile
35 40 45

Lys Thr Pro Ile Ser Leu Gln Gln Ile Arg Ala Lys Leu Lys Asn His
50 55 60

Glu Tyr Glu Thr Leu Asp Gln Leu Glu Ala Asp Leu Asn Leu Met Phe
65 70 75 80

Glu Asn Ala Lys Arg Tyr Asn Val Pro Asn Ser Ala Ile Tyr Lys Arg
85 90 95

Val Leu Lys Met Gln Gln Val Met Gln Ala Lys Lys Lys Glu Leu Ala
100 105 110



<210> 32 
<211> 113 
<212> PRT 

<213> Gallus gallus 
<400> 32 

Ser Lys Lys Asn Met Arg Lys Gln Arg Met Lys Ile Leu Tyr Asn Ala
1 5 10 15

Val Leu Glu Ala Arg Glu Ser Gly Thr Gln Arg Arg Leu Cys Asp Leu
20 25 30

Phe Met Val Lys Pro Ser Lys Lys Asp Tyr Pro Asp Tyr Tyr Lys Ile
35 40 45

Ile Leu Glu Pro Met Asp Leu Lys Met Ile Glu His Asn Ile Arg Asn
50 55 60

Asp Lys Tyr Val Gly Glu Glu Ala Met Ile Asp Asp Met Lys Leu Met
65 70 75 80

Phe Arg Asn Ala Arg His Tyr Asn Glu Glu Gly Ser Gln Val Tyr Asn
85 90 95

Asp Ala His Met Leu Glu Lys Ile Leu Lys Glu Lys Arg Lys Glu Leu
100 105 110

Gly



<210> 33 
<211> 115 
<212> PRT 

<213> Gallus gallus 
<400> 33 

Lys Lys Ser Lys Tyr Met Thr Pro Met Gln Gln Lys Leu Asn Glu Val
1 5 10 15

Tyr Glu Ala Val Lys Asn Tyr Thr Asp Lys Arg Gly Arg Arg Leu Ser
20 25 30

Ala Ile Phe Leu Arg Leu Pro Ser Arg Ser Glu Leu Pro Asp Tyr Tyr
35 40 45

Ile Thr Ile Lys Lys Pro Val Asp Met Glu Lys Ile Arg Ser His Met
50 55 60

Met Ala Asn Lys Tyr Gln Asp Ile Asp Ser Met Val Glu Asp Phe Val
65 70 75 80

Met Met Phe Asn Asn Ala Cys Thr Tyr Asn Glu Pro Glu Ser Leu Ile
85 90 95

Tyr Lys Asp Ala Leu Val Leu His Lys Val Leu Leu Glu Thr Arg Arg
100 105 110

Glu Ile Glu
115



<210> 34 
<211> 112 
<212> PRT 
<213> Unknown

<220>

<223> Description of Unknown Organism: Cited from

Jeanmougin et al., Trends in Biochemical Sciences,
22:151-153 (1997)

<400> 34 

His Asn Ala Pro Phe Asp Lys Thr Lys Phe Asp Glu Val Leu Glu Ala
1 5 10 15

Leu Val Gly Leu Lys Asp Asn Glu Gly Asn Pro Phe Asp Asp Ile Phe
20 25 30

Glu Glu Leu Pro Ser Lys Arg Tyr Phe Pro Asp Tyr Tyr Gln Ile Ile
35 40 45

Gln Lys Pro Ile Cys Tyr Lys Met Met Arg Asn Lys Ala Lys Thr Gly
50 55 60

Lys Tyr Leu Ser Met Gly Asp Phe Tyr Asp Asp Ile Arg Leu Met Val
65 70 75 80

Ser Asn Ala Gln Thr Tyr Asn Met Pro Gly Ser Leu Val Tyr Glu Cys
85 90 95

Ser Val Leu Ile Ala Asn Thr Ala Asn Ser Leu Glu Ser Lys Asp Gly
100 105 110



<210> 35 
<211> 113 
<212> PRT 
<213> Unknown 

<220> 

<223> Description of Unknown Organism: Cited from

Jeanmougin et al., Trends in Biochemical Sciences,
22:151-153 (1997)

<400> 35 

Gly Thr Asn Glu Ile Asp Val Pro Lys Val Ile Gln Asn Ile Leu Asp
1 5 10 15

Ala Leu His Glu Glu Lys Asp Glu Gln Gly Arg Phe Leu Ile Asp Ile
20 25 30

Phe Ile Asp Leu Pro Ser Lys Arg Leu Tyr Pro Asp Tyr Tyr Glu Ile
35 40 45

Ile Lys Ser Pro Met Thr Ile Lys Met Leu Glu Lys Arg Phe Lys Lys
50 55 60

Gly Glu Tyr Thr Thr Leu Glu Ser Phe Val Lys Asp Leu Asn Gln Met
65 70 75 80

Phe Ile Asn Ala Lys Thr Tyr Asn Ala Pro Gly Ser Phe Val Tyr Glu
85 90 95

Asp Ala Glu Lys Leu Ser Gln Leu Ser Ser Ser Leu Ile Ser Ser Phe
100 105 110

Ser



<210> 36 
<211> 113 
<212> PRT 

<213> Homo sapiens 
<400> 36 

Gly Thr Asn Glu Ile Asp Val Pro Lys Val Ile Gln Asn Ile Leu Asp
1 5 10 15

Ala Leu His Glu Glu Lys Asp Glu Gln Gly Arg Phe Leu Ile Asp Ile
20 25 30

Phe Ile Asp Leu Pro Ser Lys Arg Leu Tyr Pro Asp Tyr Tyr Glu Ile
35 40 45

Ile Lys Ser Pro Met Thr Ile Lys Met Leu Glu Lys Arg Phe Lys Lys
50 55 60

Gly Glu Tyr Thr Thr Leu Glu Ser Phe Val Lys Asp Leu Asn Gln Met
65 70 75 80

Phe Ile Asn Ala Lys Thr Tyr Asn Ala Pro Gly Ser Phe Val Tyr Glu
85 90 95

Asp Ala Glu Lys Leu Ser Gln Leu Ser Ser Ser Leu Ile Ser Ser Phe
100 105 110

Ser



<210> 37 
<211> 114 
<212> PRT 

<213> Homo sapiens 
<400> 37

Ser Pro Asn Pro Pro Asn Leu Thr Lys Lys Met Lys Lys Ile Val Asp
1 5 10 15

Ala Val Ile Lys Tyr Lys Asp Ser Ser Ser Gly Arg Gln Leu Ser Glu
20 25 30

Val Phe Ile Gln Leu Pro Ser Arg Lys Glu Leu Pro Glu Tyr Tyr Glu
35 40 45

Leu Ile Arg Lys Pro Val Asp Phe Lys Lys Ile Lys Glu Arg Ile Arg
50 55 60

Asn His Lys Tyr Arg Ser Leu Asn Asp Leu Glu Lys Asp Val Met Leu
65 70 75 80

Leu Cys Gln Asn Ala Gln Thr Phe Asn Leu Glu Gly Ser Leu Ile Tyr
85 90 95

Glu Asp Ser Ile Val Leu Gln Ser Val Phe Thr Ser Val Arg Gln Lys
100 105 110

Ile Glu



<210> 38 
<211> 113 
<212> PRT 

<213> Gallus gallus 
<400> 38 

Ser Pro Asn Pro Pro Lys Leu Thr Lys Gln Met Asn Ala Ile Ile Asp
1 5 10 15

Thr Val Ile Asn Tyr Lys Asp Ser Ser Gly Arg Gln Leu Ser Glu Val
20 25 30

Phe Ile Gln Leu Pro Ser Arg Lys Glu Leu Pro Glu Tyr Tyr Glu Leu
35 40 45

Ile Arg Lys Pro Val Asp Phe Lys Lys Ile Lys Glu Arg Ile Arg Asn
50 55 60

His Lys Tyr Arg Ser Leu Gly Asp Leu Glu Lys Asp Val Met Leu Leu
65 70 75 80

Cys His Asn Ala Gln Thr Phe Asn Leu Glu Gly Ser Gln Ile Tyr Glu
85 90 95

Asp Ser Ile Val Leu Gln Ser Val Phe Lys Ser Ala Arg Gln Lys Ile
100 105 110

Ala



<210> 39 
<211> 114 
<212> PRT 

<213> Gallus gallus 
<400> 39 

Ser Pro Asn Pro Pro Asn Leu Thr Lys Lys Met Lys Lys Ile Val Asp
1 5 10 15

Ala Val Ile Lys Tyr Lys Asp Ser Ser Ser Gly Arg Gln Leu Ser Glu
20 25 30

Val Phe Ile Gln Leu Pro Ser Arg Lys Glu Leu Pro Glu Tyr Tyr Glu
35 40 45

Leu Ile Arg Lys Pro Val Asp Phe Lys Lys Ile Lys Glu Arg Ile Arg
50 55 60

Asn His Lys Tyr Arg Ser Leu Asn Asp Leu Glu Lys Asp Val Met Leu
65 70 75 80

Leu Cys Gln Asn Ala Gln Thr Phe Asn Leu Glu Val Ser Leu Ile Tyr
85 90 95

Glu Asp Ser Ile Val Leu Gln Ser Val Phe Thr Ser Val Arg Gln Lys
100 105 110

Ile Glu



<210> 40 
<211> 105 
<212> PRT 

<213> Homo sapiens 
<400> 40 

Ala Lys Leu Ser Pro Ala Asn Gln Arg Lys Cys Glu Arg Val Leu Leu
1 5 10 15

Ala Leu Phe Cys His Glu Pro Cys Arg Pro Leu His Gln Leu Ala Thr
20 25 30

Asp Ser Thr Phe Ser Leu Asp Gln Pro Gly Gly Thr Leu Asp Leu Thr
35 40 45

Leu Ile Arg Ala Arg Leu Gln Glu Lys Leu Ser Pro Pro Tyr Ser Ser
50 55 60

Pro Gln Glu Phe Ala Gln Asp Val Gly Arg Met Phe Lys Gln Phe Asn
65 70 75 80

Lys Leu Thr Glu Asp Lys Ala Asp Val Gln Ser Ile Ile Gly Leu Gln
85 90 95

Arg Phe Phe Glu Thr Arg Met Asn Glu
100 105



<210> 41 
<211> 105 
<212> PRT 

<213> Mus musculus 
<400> 41 

Ala Lys Leu Ser Pro Ala Asn Gln Arg Lys Cys Glu Arg Val Leu Leu
1 5 10 15

Ala Leu Phe Cys His Glu Pro Cys Arg Pro Leu His Gln Leu Ala Thr
20 25 30

Asp Ser Thr Phe Ser Met Glu Gln Pro Gly Gly Thr Leu Asp Leu Thr
35 40 45

Leu Ile Arg Ala Arg Leu Gln Glu Lys Leu Ser Pro Pro Tyr Ser Ser
50 55 60

Pro Gln Glu Phe Ala Gln Asp Val Gly Arg Met Phe Lys Gln Phe Asn
65 70 75 80

Lys Leu Thr Glu Asp Lys Ala Asp Val Gln Ser Ile Ile Gly Leu Gln
85 90 95

Arg Phe Phe Glu Thr Arg Met Asn Asp
100 105



<210> 42
<211> 108 
<212> PRT 
<213> Mus sp.

<400> 42 

Thr Lys Leu Thr Pro Ile Asp Lys Arg Lys Cys Glu Arg Leu Leu Leu
1 5 10 15

Phe Leu Tyr Cys His Glu Met Ser Leu Ala Phe Gln Asp Pro Val Pro
20 25 30

Leu Thr Val Pro Asp Tyr Tyr Lys Ile Ile Lys Asn Pro Met Asp Leu
35 40 45

Ser Thr Ile Lys Lys Arg Leu Gln Glu Asp Tyr Cys Met Tyr Thr Lys
50 55 60

Pro Glu Asp Phe Val Ala Asp Phe Arg Leu Ile Phe Gln Asn Cys Ala
65 70 75 80

Glu Phe Asn Glu Pro Asp Ser Glu Val Ala Asn Ala Gly Ile Lys Leu
85 90 95

Glu Ser Tyr Phe Glu Glu Leu Leu Lys Asn Leu Tyr
100 105
95 



<210> 43 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: consensus
<220> 

<221> VARIANT 
<222> (1) 

<223> It represents 2 amino acids. They can be any amino
acids.

<220> 

<221> VARIANT 
<222> (3) 

<223> It represents 2 to 3 amino acids. They can be any 
amino acids. 
<220>

<221> VARIANT
<222> (5)

<223> It represents 5 to 8 amino acids. They can be any
amino acids.

<220>

<221> VARIANT
<222> (7)

<223> It represents one amino acid. It can be any amino
acid.

<220>

<221> VARIANT
<222> (10)

<223> It represents 5 amino acids. They can be any amino
acids.

<220> 

<221> VARIANT 
<222> (6) 

<223> It represents any amino acid from the group of: P, 
K, or H. 

<220> 

<221> VARIANT 
<222> (9) 

<223> It represents any amino acid from the group of: Y, 
F, or H. 

<220> 

<221> VARIANT 
<222> (12) 

<223> It represents any amino acid from the group of: M, 
I, or V. 

<400> 43 

Xaa Phe Xaa Pro Xaa Xaa Xaa Tyr Xaa Xaa Pro Xaa Asp 
1 5 10
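
The consensus of SEQ ID NO: 43 can be read as a search pattern over one-letter amino acid sequences. The following is a minimal sketch in Python, assuming the VARIANT annotations above are taken literally (position 1 = any 2 residues, position 3 = any 2 to 3 residues, position 5 = any 5 to 8 residues, position 7 = any one residue, position 10 = any 5 residues, position 6 = P, K, or H, position 9 = Y, F, or H, position 12 = M, I, or V). The names `THREE_TO_ONE`, `to_one_letter`, `CONSENSUS_43`, and `find_consensus` are hypothetical and are not part of the listing; using `[A-Z]` for "any amino acid" is a simplification that also admits letters that are not standard residues.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[token] for token in three_letter_seq.split())


# Hypothetical regular-expression encoding of SEQ ID NO: 43.
CONSENSUS_43 = re.compile(
    r"[A-Z]{2}"    # position 1: any 2 residues
    r"F"           # position 2: Phe
    r"[A-Z]{2,3}"  # position 3: any 2 to 3 residues
    r"P"           # position 4: Pro
    r"[A-Z]{5,8}"  # position 5: any 5 to 8 residues
    r"[PKH]"       # position 6: P, K, or H
    r"[A-Z]"       # position 7: any one residue
    r"Y"           # position 8: Tyr
    r"[YFH]"       # position 9: Y, F, or H
    r"[A-Z]{5}"    # position 10: any 5 residues
    r"P"           # position 11: Pro
    r"[MIV]"       # position 12: M, I, or V
    r"D"           # position 13: Asp
)


def find_consensus(one_letter_seq):
    """Return (0-based start, matched text) of the first hit, or None."""
    match = CONSENSUS_43.search(one_letter_seq)
    return None if match is None else (match.start(), match.group())


# Residues 25 to 52 of SEQ ID NO: 19, copied from the listing.
fragment = to_one_letter(
    "Phe Ala Trp Pro Phe Arg Gln Pro Val Asp Ala Val Lys Leu Gly Leu "
    "Pro Asp Tyr His Lys Ile Ile Lys Gln Pro Met Asp"
)
print(find_consensus(fragment))
```

Because the pattern is applied with `search`, it reports only the first occurrence; scanning a full-length protein for every occurrence would need `finditer` instead.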



<210> 44 
<211> 20 
<212> PRT 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: consensus
<400> 44 

Trp Pro Phe Met Glu Pro Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr
1 5 10 15

Glu Val Ile Arg
20